FW: [PATCH GCC/pr56124] Don't prefer memory if the source of load operation has side effect
Sorry for the wrong list.

-----Original Message-----
From: Bin Cheng [mailto:bin.ch...@arm.com]
Sent: Monday, March 25, 2013 3:00 PM
To: g...@gcc.gnu.org
Subject: [PATCH GCC/pr56124] Don't prefer memory if the source of load operation has side effect

Hi,
As reported in PR56124, IRA causes a redundant reload by preferring to put the pseudo which is the target of a load in memory. Generally this is good, except in the case where the source of the load has a side effect. This patch fixes the issue by checking whether the source of the load has a side effect.

I tested the patch on x86/thumb2. Is it OK?

Thanks.

2013-03-25  Bin Cheng  <bin.ch...@arm.com>

        PR target/56124
        * ira-costs.c (scan_one_insn): Check whether the source rtx of
        loading has side effect.

Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c (revision 197029)
+++ gcc/ira-costs.c (working copy)
@@ -1293,10 +1293,13 @@ scan_one_insn (rtx insn)
      a memory requiring special instructions to load it, decreasing
      mem_cost might result in it being loaded using the specialized
      instruction into a register, then stored into stack and loaded
-     again from the stack.  See PR52208.  */
+     again from the stack.  See PR52208.
+
+     Don't do this if SET_SRC (set) has side effect.  See PR56124.  */
   if (set != 0 && REG_P (SET_DEST (set)) && MEM_P (SET_SRC (set))
       && (note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL_RTX
-      && ((MEM_P (XEXP (note, 0)))
+      && ((MEM_P (XEXP (note, 0))
+           && !side_effects_p (SET_SRC (set)))
           || (CONSTANT_P (XEXP (note, 0))
               && targetm.legitimate_constant_p (GET_MODE (SET_DEST (set)),
                                                 XEXP (note, 0))
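Editorial illustration, not part of the posted patch: the memory arm of the condition above can be modelled with a toy operand type. Everything prefixed `toy_` is hypothetical and only mimics the shape of the check; the assumption is that a load through an auto-modifying address (for example post-increment) is the kind of source that `side_effects_p` would flag, so it must not be treated as a value that can simply stay in memory.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for an RTL memory operand.  */
struct toy_mem
{
  bool is_mem;       /* The operand is a memory reference.  */
  bool auto_modify;  /* Its address is pre/post incremented or decremented.  */
};

/* Toy analogue of a side-effect check for the model above.  */
static bool
toy_side_effects_p (const struct toy_mem *src)
{
  return src->auto_modify;
}

/* Mirrors only the memory arm of the patched condition: memory is
   preferred when the source is a MEM, the equivalence note is a MEM,
   and evaluating the source has no side effect.  */
static bool
toy_prefer_memory (const struct toy_mem *src, bool equiv_is_mem)
{
  return src->is_mem && equiv_is_mem && !toy_side_effects_p (src);
}
```

In this model a plain load still prefers memory, while a post-increment load does not, which is the behaviour the patch description asks for.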
FW: [PATCH GCC]Relax the probability condition in CE pass when optimizing for code size
Wrong list.

-----Original Message-----
From: Bin Cheng [mailto:bin.ch...@arm.com]
Sent: Monday, March 25, 2013 3:01 PM
To: g...@gcc.gnu.org
Subject: [PATCH GCC]Relax the probability condition in CE pass when optimizing for code size

Hi,
The CE pass has been adapted to work with the probability of then/else branches. Now the transformation is done only when it's profitable. Problem is the change affects both performance and size, causing size regression in many cases (especially in C library like Newlib). So this patch relaxes the probability condition when we are optimizing for size.

Below is an example from Newlib:

unsigned int strlen (const char *);
void * realloc (void * __r, unsigned int __size) ;
void * memcpy (void *, const void *, unsigned int);
int argz_add(char **argz , unsigned int *argz_len , const char *str)
{
  int len_to_add = 0;
  unsigned int last = *argz_len;

  if (str == ((void *)0))
    return 0;

  len_to_add = strlen(str) + 1;
  *argz_len += len_to_add;

  if(!(*argz = (char *)realloc(*argz, *argz_len)))
    return 12;

  memcpy(*argz + last, str, len_to_add);
  return 0;
}

The generated assembly for Os/cortex-m0 is like:

argz_add:
        push    {r0, r1, r2, r4, r5, r6, r7, lr}
        mov     r6, r0
        mov     r7, r1
        mov     r4, r2
        ldr     r5, [r1]
        beq     .L3
        mov     r0, r2
        bl      strlen
        add     r0, r0, #1
        add     r1, r0, r5
        str     r0, [sp, #4]
        str     r1, [r7]
        ldr     r0, [r6]
        bl      realloc
        mov     r3, #12
        str     r0, [r6]
        cmp     r0, #0
        beq     .L2
        add     r0, r0, r5
        mov     r1, r4
        ldr     r2, [sp, #4]
        bl      memcpy
        mov     r3, #0
        b       .L2
.L3:
        mov     r3, r2
.L2:
        mov     r0, r3

In which branch/mov instructions around .L3 can be CEed with this patch.

During the work I observed passes before combine might interfere with CE pass, so this patch is enabled for ce2/ce3 after combination pass.

It is tested on x86/thumb2 for both normal and Os. Is it ok for trunk?

2013-03-25  Bin Cheng  <bin.ch...@arm.com>

        * ifcvt.c (ifcvt_after_combine): New static variable.
        (cheap_bb_rtx_cost_p): Set scale to REG_BR_PROB_BASE when optimizing
        for size.
        (rest_of_handle_if_conversion, rest_of_handle_if_after_combine):
        Clear/set the variable ifcvt_after_combine.

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c (revision 197029)
+++ gcc/ifcvt.c (working copy)
@@ -67,6 +67,9 @@
 
 #define NULL_BLOCK ((basic_block) NULL)
 
+/* TRUE if after combine pass.  */
+static bool ifcvt_after_combine;
+
 /* # of IF-THEN or IF-THEN-ELSE blocks we looked at  */
 static int num_possible_if_blocks;
 
@@ -144,8 +147,14 @@ cheap_bb_rtx_cost_p (const_basic_block bb, int sca
   /* Our branch probability/scaling factors are just estimates and don't
      account for cases where we can get speculation for free and other
      secondary benefits.  So we fudge the scale factor to make speculating
-     appear a little more profitable.  */
+     appear a little more profitable when optimizing for performance.  */
   scale += REG_BR_PROB_BASE / 8;
+
+  /* Set the scale to REG_BR_PROB_BASE to be more agressive when
+     optimizing for size and after combine pass.  */
+  if (!optimize_function_for_speed_p (cfun) && ifcvt_after_combine)
+    scale = REG_BR_PROB_BASE;
+
   max_cost *= scale;
 
   while (1)
@@ -4445,6 +4454,7 @@ gate_handle_if_conversion (void)
 static unsigned int
 rest_of_handle_if_conversion (void)
 {
+  ifcvt_after_combine = false;
   if (flag_if_conversion)
     {
       if (dump_file)
@@ -4494,6 +4504,7 @@ gate_handle_if_after_combine (void)
 static unsigned int
 rest_of_handle_if_after_combine (void)
 {
+  ifcvt_after_combine = true;
   if_convert ();
   return 0;
 }
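Editorial illustration, not part of the posted patch: the scale selection in the cheap_bb_rtx_cost_p hunk can be sketched as a standalone function. The names `toy_effective_scale` and `toy_max_cost` are hypothetical; `TOY_PROB_BASE` is assumed to match GCC's REG_BR_PROB_BASE value of 10000, and the two boolean parameters stand in for `optimize_function_for_speed_p (cfun)` and `ifcvt_after_combine`.

```c
#include <stdbool.h>

/* Assumed to equal GCC's REG_BR_PROB_BASE; repeated here so the sketch
   stands alone.  */
enum { TOY_PROB_BASE = 10000 };

/* Scale used for the speculation budget: the incoming branch
   probability plus a 1/8 fudge, or the full base when optimizing for
   size after combine.  */
static int
toy_effective_scale (int scale, bool optimize_for_speed, bool after_combine)
{
  scale += TOY_PROB_BASE / 8;
  if (!optimize_for_speed && after_combine)
    scale = TOY_PROB_BASE;
  return scale;
}

/* Budget against which the cost of the speculated block is compared.  */
static int
toy_max_cost (int max_cost, int scale, bool optimize_for_speed,
              bool after_combine)
{
  return max_cost * toy_effective_scale (scale, optimize_for_speed,
                                         after_combine);
}
```

With a branch probability of 2000, the speed path yields a scale of 3250, while the size-after-combine path yields 10000, so many more blocks pass the cost check in the latter case.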
Re: [PATCH] PR55033: Fix
Hello,

since GCC 4.8.0 is now released it would be very kind if someone can decide if this fix for PR55033 can be integrated in the 4.8 branch and/or trunk. For 32-bit Book E PowerPC targets this bug is a show stopper from my point of view. Even though this is only a tertiary GCC platform without a maintainer, my guess is that it is still a popular GNU/Linux embedded platform.

On 02/27/2013 03:40 PM, Sebastian Huber wrote:
Hello,

so now we have the approval of the Windows specific part. David Edelsohn said that the rs6000 specific part is all right:

http://gcc.gnu.org/ml/gcc/2013-02/msg00186.html

This patch is available for review since October 2012. What is missing to get this finally committed to GCC 4.8?

On 02/24/2013 02:04 PM, Dave Korn wrote:
On 12/02/2013 16:11, Sebastian Huber wrote:
This patch from Alan Modra fixes a section type conflict error. See also

http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02172.html

The Windows part is OK. I ran the g++ testsuite (gcc/, not libstdc) with no change before and after.

cheers,
DaveK

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.
[Patch, wwwdocs, committed] was: Re: TYPO - http://gcc.gnu.org/gcc-4.8/changes.html
John Franklin wrote:
cpmpilation probably meant compilation

Thanks for the report. I have fixed it with the attached patch.

Tobias

Index: gcc-4.8/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.111
diff -u -r1.111 changes.html
--- gcc-4.8/changes.html 23 Mar 2013 00:54:43 -0000 1.111
+++ gcc-4.8/changes.html 25 Mar 2013 08:35:07 -0000
@@ -675,7 +675,7 @@
 <h3 id="rx">RX</h3>
   <ul>
     <li>This target will now issue a warning message whenever multiple fast
-    interrupt handlers are found in the same cpmpilation unit.  This feature can
+    interrupt handlers are found in the same compilation unit.  This feature can
     be turned off by the new <code>-mno-warn-multiple-fast-interrupts</code>
     command-line option.</li>
   </ul>
Re: [testsuite] Don't XFAIL gfortran.dg/do_1.f90 (PR fortran/54932)
On Wed, 20 Mar 2013, Rainer Orth wrote:

Tobias Burnus <bur...@net-b.de> writes:

Rainer Orth wrote:
As discussed in PR fortran/54932, the gfortran.dg/do_1.f90 execution tests recently started to XPASS at all optimization levels, adding lots of testsuite noise. The following patch removes the xfail, allowing all tests to pass.

Tested with the appropriate runtest invocations on x86_64-unknown-linux-gnu, i386-pc-solaris2.11, and sparc-sun-solaris2.11. Ok for mainline and 4.8 branch?

Removing the xfail is okay. However, I wonder whether it would be better to leave a reference to the PR in case the failure pops up again. As the code is ill-defined, the failures might pop up in the future and the reference can help with analysis.

I prefer to leave the PR reference removed. If the failure crops up again, it's a simple matter of looking at the ChangeLog, svn annotate, or bugzilla to discover the bug, if not, we keep the obsolete comment forever.

OK - as is or with an updated reference to the PR. ?

For the branch, it is the RMs' call when it can be committed. Jakub, Richard?

It's fine now.

Thanks,
Richard.

Please wait with the committal until GCC's web mail archive works again for gcc-cvs.

Done. Thanks.

Rainer

2013-03-19  Rainer Orth  <r...@cebitec.uni-bielefeld.de>

        PR fortran/54932
        * gfortran.dg/do_1.f90: Don't xfail.

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PING^5] PR 54805: __gthread_tsd* in vxlib-tls.c
On Thu, Mar 21, 2013 at 12:22 AM, Maxim Kuvyrkov <ma...@kugelworks.com> wrote:

On 20/03/2013, at 1:35 AM, rbmj wrote:

On 19-Mar-13 03:04, Maxim Kuvyrkov wrote:
Will commit to trunk once the server is up.

The patch is now committed.

Regarding 4.8, we should've really tried to work it out earlier. If you want to pursue backport to 4.8, please attach the log of PPA system rejecting the package

The error is:

==
Finished at 20130318-0642
Build needed 00:14:20, 804796k disk space
Function `__gthread_get_tsd_data' implicitly converted to pointer at /build/buildd/gcc-powerpc-wrs-vxworks-4.8.0+0svn196132/libgcc/config/vxlib-tls.c:164

Our automated build log filter detected the problem(s) above that will likely cause your package to segfault on architectures where the size of a pointer is greater than the size of an integer, such as ia64 and amd64.

This is often due to a missing function prototype definition.

Since use of implicitly converted pointers is always fatal to the application on ia64, they are errors. Please correct them for your next upload.
==

This problem does not apply on the target (powerpc-wrs-vxworks), where sizeof(int*) == sizeof(int(*)()) == sizeof(int) == 4. However, the build system's filters are too stupid to realize this. Because the warning is spurious really the fact that the automated build system rejects the package is a bug on the build system's part. However, doing it the Right Way is so _easy_...

Richard,

As release manager, do you have any objections to backporting this patch to 4.8 branch? It affects only VxWorks targets and it is quite harmless (the patch fixes a compilation warning during building GCC for VxWorks targets).

It's certainly fine now.

Richard.

Thanks,

--
Maxim Kuvyrkov
KugelWorks
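Editorial illustration, not taken from the thread: the build filter quoted above worries about a pointer-returning function that is used without a prototype, so the caller assumes an `int` result. The sketch below models that with fixed-width integers instead of real pointers so that it behaves the same on every host. Both `toy_` functions are hypothetical, and the model ignores sign extension, which a real `int` return would also involve.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models a pointer value passing through a 32-bit `int' result: only
   the low 32 bits survive.  */
static uint64_t
toy_through_implicit_int (uint64_t pointer_bits)
{
  uint32_t as_int = (uint32_t) pointer_bits;
  return as_int;
}

/* True when a pointer of PTR_SIZE bytes fits in an int of INT_SIZE
   bytes, i.e. when the implicit conversion loses nothing.  */
static bool
toy_pointer_fits (size_t ptr_size, size_t int_size)
{
  return ptr_size <= int_size;
}
```

With 4-byte pointers and 4-byte ints, as stated for powerpc-wrs-vxworks, `toy_pointer_fits (4, 4)` holds and nothing is lost; with 8-byte pointers the upper half of the value is dropped, which is the case the filter is guarding against.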
Re: [testsuite] Adding -fno-pic to certain tests
On Sun, Mar 24, 2013 at 7:49 AM, Alexander Ivchenko <aivch...@gmail.com> wrote:
Hi,

Finally got my hands on that: the attached patch adds the target nonpic for those tests that require the availability of functions defined in them. OK for trunk?

For testsuite/gcc.target/i386/mmx-1.c you still don't use nopic. Please fix that.

Ok with that change.

Thanks,
Richard.

thanks,
Alexander

2013/1/10 Richard Biener <richard.guent...@gmail.com>:

On Thu, Jan 10, 2013 at 2:50 PM, Alexander Ivchenko <aivch...@gmail.com> wrote:
Hi,

It all began with the discussion here:
http://gcc.gnu.org/ml/gcc/2012-11/msg00205.html

Since -fpic option is turned on by default in Android we have certain test fails. The reason for that is that those tests rely on the availability of functions, defined in them and with -fpic compiler conservatively assumes that they are AVAIL_OVERWRITABLE.

The attached patch adding -fno-pic option for tests that fail because of that.

I think this should be a dg-requires nopic instead. Otherwise testing with -fPIC/-fno-pic will not show expected differences.

Richard.

2013-01-10  Alexander Ivchenko  <alexander.ivche...@intel.com>

        * g++.dg/ipa/ivinline-1.C: Add -fno-pic option.
        * g++.dg/ipa/ivinline-2.C: Likewise.
        * g++.dg/ipa/ivinline-3.C: Likewise.
        * g++.dg/ipa/ivinline-4.C: Likewise.
        * g++.dg/ipa/ivinline-5.C: Likewise.
        * g++.dg/ipa/ivinline-7.C: Likewise.
        * g++.dg/ipa/ivinline-8.C: Likewise.
        * g++.dg/ipa/ivinline-9.C: Likewise.
        * g++.dg/cpp0x/noexcept03.C: Likewise.
        * gcc.dg/const-1.c: Likewise.
        * gcc.dg/ipa/pure-const-1.c: Likewise.
        * gcc.dg/noreturn-8.c: Likewise.
        * gcc.dg/tree-ssa/ipa-split-5.c: Likewise.
        * gcc.dg/tree-ssa/loadpre6.c: Likewise.
        * gcc.c-torture/execute/pr33992.c: Likewise.
        * gcc.c-torture/execute/pr33992.x: New file.

ok for mainline?

thanks,
Alexander
[PATCH][AARCH64] Restrict m constraint for narrow moves
Hi,

Loads and stores with PC-relative addresses are not supported for SHORT modes. This patch fixes a silent bug and implements this restriction for the generic m constraint.

Tested successfully on aarch64-none-elf.

OK for trunk?

Thanks
Sofiane

-

2013-03-25  Sofiane Naci  <sofiane.n...@arm.com>

        * config/aarch64/aarch64.c (aarch64_classify_address): Support
        PC-relative load in SI modes and above only.

aarch64-restrict-m-constraint.patch
Description: Binary data
Re: extend fwprop optimization
On Sun, Mar 24, 2013 at 5:18 AM, Wei Mi <w...@google.com> wrote:
This is the patch to add the shift truncation in simplify_binary_operation_1. I add a new hook TARGET_SHIFT_COUNT_TRUNCATED which uses enum rtx_code to decide whether we can do shift truncation. I didn't use TARGET_SHIFT_TRUNCATION_MASK in simplify_binary_operation_1 because it uses the macro SHIFT_COUNT_TRUNCATED. If I change SHIFT_COUNT_TRUNCATED to targetm.shift_count_truncated in TARGET_SHIFT_TRUNCATION_MASK, I need to give TARGET_SHIFT_TRUNCATION_MASK a enum rtx_code param, which wasn't trivial to get at many places in existing code.

patch.1 ~ patch.4 pass regression and bootstrap on x86_64-unknown-linux-gnu.

Doing this might prove dangerous in case some pass may later decide to use an instruction that behaves in different ways. Consider

  tem = 1 << (n & 255);   // count truncated
  x = y & tem;            // bittest instruction bit nr _not_ truncated

so if tem is expanded to use a shift instruction which truncates the shift count the explicit and is dropped. If later combine comes around and combines the bit-test to use the bittest instruction which does not implicitly truncate the count you have generated wrong-code.

So we need to make sure any explicit truncation originally in place is kept in the RTL - which means SHIFT_COUNT_TRUNCATED should not exist at all, but instead there would be two patterns for shifts with implicit truncation - one involving the truncation (canonicalized to bitwise and) and one not involving the truncation.

Richard.

Thanks,
Wei.

On Sun, Mar 17, 2013 at 12:15 AM, Wei Mi <w...@google.com> wrote:
Hi,

On Sat, Mar 16, 2013 at 3:48 PM, Steven Bosscher <stevenb@gmail.com> wrote:

On Tue, Mar 12, 2013 at 8:18 AM, Wei Mi wrote:
For the motivational case, I need insn splitting to get the cost right. insn splitting is not very intrusive. All I need is to call split_insns func.

It may not look very intrusive, but there's a lot happening in the back ground.
You're creating a lot of new RTL, and then just throw it away again. You fake the compiler into thinking you're much deeper in the pipeline than you really are. You're assuming there are no side-effects other than that some insn gets split, but there are back ends where splitters may have side-effects.

Ok, then I will remove the split_insns call.

Even though I've asked twice now, you still have not explained this motivational case, except to say that there is one. *What* are you trying to do, *what* is not happening without the splits, and *what* happens if you split. Only if you explain that in a lot more detail than "I have a motivational case" then we can look into what is a proper solution.

:-). Sorry, I didn't say it clearly. The motivational case is the one mentioned in the following posts (split_insns changes a << (b & 63) to a << b).

http://gcc.gnu.org/ml/gcc/2013-01/msg00181.html
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01144.html

If I remove the split_insns call and related cost estimation adjustment, the fwprop 18-->22 and 18-->23 will punt, because fwprop here looks like a reverse process of cse, the total cost after fwprop change is increased.

Def insn 18:
Use insn 23
Use insn 22

If we include the split_insns cost estimation adjustment.
  extra benefit by removing def insn 18 = 5
  change[0]: benefit = 0, verified - ok  // The cost of insn 22 will not change after fwprop + insn splitting.
  change[1]: benefit = 0, verified - ok  // The insn 23 is the same with insn 22
Total benefit is 5, fwprop will go on.

If we remove the split_insns cost estimation adjustment.
  extra benefit by removing def insn 18 = 5
  change[0]: benefit = -4, verified - ok  // The costs of insn 22 and insn 23 will increase after fwprop.
  change[1]: benefit = -4, verified - ok  // The insn 23 is the same with insn 22
Total benefit is -3, fwprop will punt.
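Editorial illustration, not code from the patch under discussion: the two totals quoted above are the benefit of removing the def insn plus the per-use benefits. The helper names are hypothetical, and treating "total greater than zero" as the threshold for going ahead is an assumption; the thread only shows that 5 proceeds and -3 punts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Benefit of removing the def insn plus the benefit of each changed
   use.  */
static int
toy_total_benefit (int def_removal_benefit, const int *change_benefit,
                   size_t n_changes)
{
  int total = def_removal_benefit;
  for (size_t i = 0; i < n_changes; i++)
    total += change_benefit[i];
  return total;
}

/* Assumed decision rule: propagate only when the total is positive.  */
static bool
toy_fwprop_proceeds (int total_benefit)
{
  return total_benefit > 0;
}
```

With the splitting adjustment the inputs are 5, 0 and 0, giving 5; without it they are 5, -4 and -4, giving -3.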
How about adding the (a << (b&63) == a << b) transformation in simplify_binary_operation_1, because (a << (b&63) == a << b) is a kind of architecture specific expr simplification? Then fwprop could do the propagation as I expect.

The problem with some of the splitters is that they exist to break up RTL from 'expand' to initially keep some pattern together to allow the code transformation passes to handle the pattern as one instruction. This made sense when RTL was the only intermediate representation and splitting too early would inhibit some optimizations. But I would expect most (if not all) such cases to be less relevant because of the GIMPLE middle-end work. The only splitters you can trigger are the pre-reload splitters (all the reload_completed conditions obviously can't trigger if you're splitting from fwprop). Perhaps those splitters can/should run earlier, or be made obsolete by expanding directly to the post-splitting insns.

Unfortunately, it's not possible to tell for your case, because you haven't explained it yet...

So how
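Editorial illustration, not code from the thread: the hazard raised in this discussion can be modelled with two toy operations. `toy_shl_truncating` stands for a shift instruction that only looks at the low 6 bits of its count, and `toy_bittest` stands for a bit-test instruction over a multi-word bit string that uses the full bit number. Both names and the 64-bit word size are assumptions made for the sketch.

```c
#include <stdint.h>

/* Toy shift unit: like the hardware discussed, it ignores all but the
   low 6 bits of the count, so masking the count first is redundant.  */
static uint64_t
toy_shl_truncating (uint64_t a, unsigned count)
{
  return a << (count & 63u);
}

/* Toy bit test over an array of words: the bit number is NOT
   truncated, so bit 70 means bit 6 of the second word.  */
static int
toy_bittest (const uint64_t *words, unsigned bit)
{
  return (int) ((words[bit / 64u] >> (bit % 64u)) & 1u);
}
```

For the shift alone, `toy_shl_truncating (1, n)` and `toy_shl_truncating (1, n & 63)` always agree, so dropping the mask looks safe. Once the mask is gone, though, a later pass that rewrites `words[0] & (1 << n)` into `toy_bittest (words, n)` gets a different answer for n = 70, which is why the explicit truncation has to stay visible in the RTL.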
[PATCH] Make LIM depend list a vec
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-03-25  Richard Biener  <rguent...@suse.de>

        * tree-ssa-loop-im.c (struct depend): Remove.
        (struct lim_aux_data): Make depends a vec of gimples.
        (free_lim_aux_data): Adjust.
        (add_dependency): Likewise.
        (set_level): Likewise.

Index: trunk/gcc/tree-ssa-loop-im.c
===================================================================
*** trunk.orig/gcc/tree-ssa-loop-im.c 2013-03-13 14:19:17.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c 2013-03-13 15:38:55.835709837 +0100
*************** along with GCC; see the file COPYING3.
*** 58,72 ****
       something;
     }  */
  
- /* A type for the list of statements that have to be moved in order to be able
-    to hoist an invariant computation.  */
- 
- struct depend
- {
-   gimple stmt;
-   struct depend *next;
- };
- 
  /* The auxiliary data kept for each statement.  */
  
  struct lim_aux_data
--- 58,63 ----
*************** struct lim_aux_data
*** 85,95 ****
    unsigned cost;                /* Cost of the computation performed by the
                                     statement.  */
  
!   struct depend *depends;       /* List of statements that must be also hoisted
!                                    out of the loop when this statement is
!                                    hoisted; i.e. those that define the operands
!                                    of the statement and are inside of the
!                                    MAX_LOOP loop.  */
  };
  
  /* Maps statements to their lim_aux_data.  */
--- 76,86 ----
    unsigned cost;                /* Cost of the computation performed by the
                                     statement.  */
  
!   vec<gimple> depends;          /* Vector of statements that must be also
!                                    hoisted out of the loop when this statement
!                                    is hoisted; i.e. those that define the
!                                    operands of the statement and are inside of
!                                    the MAX_LOOP loop.  */
  };
  
  /* Maps statements to their lim_aux_data.  */
*************** get_lim_data (gimple stmt)
*** 204,216 ****
  static void
  free_lim_aux_data (struct lim_aux_data *data)
  {
!   struct depend *dep, *next;
! 
!   for (dep = data->depends; dep; dep = next)
!     {
!       next = dep->next;
!       free (dep);
!     }
    free (data);
  }
  
--- 195,201 ----
  static void
  free_lim_aux_data (struct lim_aux_data *data)
  {
!   data->depends.release();
    free (data);
  }
  
*************** add_dependency (tree def, struct lim_aux
*** 475,481 ****
    gimple def_stmt = SSA_NAME_DEF_STMT (def);
    basic_block def_bb = gimple_bb (def_stmt);
    struct loop *max_loop;
-   struct depend *dep;
    struct lim_aux_data *def_data;
  
    if (!def_bb)
--- 460,465 ----
*************** add_dependency (tree def, struct lim_aux
*** 500,509 ****
        && def_bb->loop_father == loop)
      data->cost += def_data->cost;
  
!   dep = XNEW (struct depend);
!   dep->stmt = def_stmt;
!   dep->next = data->depends;
!   data->depends = dep;
  
    return true;
  }
--- 484,490 ----
        && def_bb->loop_father == loop)
      data->cost += def_data->cost;
  
!   data->depends.safe_push (def_stmt);
  
    return true;
  }
*************** static void
*** 866,873 ****
  set_level (gimple stmt, struct loop *orig_loop, struct loop *level)
  {
    struct loop *stmt_loop = gimple_bb (stmt)->loop_father;
-   struct depend *dep;
    struct lim_aux_data *lim_data;
  
    stmt_loop = find_common_loop (orig_loop, stmt_loop);
    lim_data = get_lim_data (stmt);
--- 847,855 ----
  set_level (gimple stmt, struct loop *orig_loop, struct loop *level)
  {
    struct loop *stmt_loop = gimple_bb (stmt)->loop_father;
    struct lim_aux_data *lim_data;
+   gimple dep_stmt;
+   unsigned i;
  
    stmt_loop = find_common_loop (orig_loop, stmt_loop);
    lim_data = get_lim_data (stmt);
*************** set_level (gimple stmt, struct loop *ori
*** 881,888 ****
                || flow_loop_nested_p (lim_data->max_loop, level));
  
    lim_data->tgt_loop = level;
!   for (dep = lim_data->depends; dep; dep = dep->next)
!     set_level (dep->stmt, orig_loop, level);
  }
  
  /* Determines an outermost loop from that we want to hoist the statement STMT.
--- 863,870 ----
                || flow_loop_nested_p (lim_data->max_loop, level));
  
    lim_data->tgt_loop = level;
!   FOR_EACH_VEC_ELT (lim_data->depends, i, dep_stmt)
!     set_level (dep_stmt, orig_loop, level);
  }
  
  /* Determines an outermost loop from that we want to hoist the statement STMT.
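Editorial illustration, not GCC code: the change above swaps a hand-rolled singly linked list for a growable vector. The sketch below shows the same idea in plain C with a hypothetical `toy_vec` of statement ids; it is not GCC's `vec` template, and `toy_vec_push` and `toy_vec_release` only loosely correspond to `safe_push` and `release`.

```c
#include <stdlib.h>

/* Hypothetical growable array of statement ids, standing in for a
   vector of statements.  */
struct toy_vec
{
  int *elts;
  size_t len;
  size_t cap;
};

/* Append one element, doubling the capacity when full.  Returns 0 on
   success and -1 if the allocation fails.  */
static int
toy_vec_push (struct toy_vec *v, int stmt)
{
  if (v->len == v->cap)
    {
      size_t new_cap = v->cap ? v->cap * 2 : 4;
      int *p = realloc (v->elts, new_cap * sizeof *p);
      if (!p)
        return -1;
      v->elts = p;
      v->cap = new_cap;
    }
  v->elts[v->len++] = stmt;
  return 0;
}

/* Free the storage in one call; no per-node walk is needed.  */
static void
toy_vec_release (struct toy_vec *v)
{
  free (v->elts);
  v->elts = NULL;
  v->len = 0;
  v->cap = 0;
}
```

One difference worth noting in this model: the old list prepended each new dependency, so it was walked newest first, whereas a vector is walked in insertion order. For a traversal like set_level, which visits every element, that order presumably does not matter.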
Re: [PATCH][AARCH64] Restrict m constraint for narrow moves
On 25/03/13 09:32, Sofiane Naci wrote:
Hi,

Loads and stores with PC-relative addresses are not supported for SHORT modes. This patch fixes a silent bug and implements this restriction for the generic m constraint.

Tested successfully on aarch64-none-elf.

OK for trunk?

Thanks
Sofiane

-

2013-03-25  Sofiane Naci  <sofiane.n...@arm.com>

        * config/aarch64/aarch64.c (aarch64_classify_address): Support
        PC-relative load in SI modes and above only.

OK.

This is also an issue in 4.8, please back port the patch.

Cheers
/Marcus
[PATCH] Rest of LIM TLC
This is the rest of my queued LIM TLC (apart from limiting its dependence checks).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2013-03-25  Richard Biener  <rguent...@suse.de>

        * tree-ssa-loop-im.c (struct mem_ref): Use bitmap_head instead
        of bitmap.
        (memory_references): Likewise.
        (outermost_indep_loop, mem_ref_alloc, mark_ref_stored,
        gather_mem_refs_stmt, record_dep_loop, ref_indep_loop_p_1,
        ref_indep_loop_p_2, find_refs_for_sm): Adjust.
        (gather_mem_refs_in_loops): Fold into ...
        (analyze_memory_references): ... this.  Move initialization
        to tree_ssa_lim_initialize.
        (fill_always_executed_in): Rename to ...
        (fill_always_executed_in_1): ... this.
        (fill_always_executed_in): Move contains_call computation to
        this new function from ...
        (tree_ssa_lim_initialize): ... here.
        (tree_ssa_lim): Call fill_always_executed_in.

Index: gcc/tree-ssa-loop-im.c
===================================================================
--- gcc/tree-ssa-loop-im.c (revision 197031)
+++ gcc/tree-ssa-loop-im.c (working copy)
@@ -108,7 +108,7 @@ typedef struct mem_ref
                            query meta-data.  */
   ao_ref mem;
 
-  bitmap stored;                /* The set of loops in that this memory location
+  bitmap_head stored;           /* The set of loops in that this memory location
                                    is stored to.  */
   vec<vec<mem_ref_loc> > accesses_in_loop;
                                 /* The locations of the accesses.  Vector
@@ -117,14 +117,14 @@ typedef struct mem_ref
   /* The following sets are computed on demand.  We keep both set and
      its complement, so that we know whether the information was
      already computed or not.  */
-  bitmap indep_loop;            /* The set of loops in that the memory
+  bitmap_head indep_loop;       /* The set of loops in that the memory
                                    reference is independent, meaning:
                                    If it is stored in the loop, this store
                                      is independent on all other loads and
                                      stores.
                                    If it is only loaded, then it is independent
                                      on all stores in the loop.  */
-  bitmap dep_loop;              /* The complement of INDEP_LOOP.  */
+  bitmap_head dep_loop;         /* The complement of INDEP_LOOP.  */
 } *mem_ref_p;
 
 /* We use two bits per loop in the ref->{in,}dep_loop bitmaps, the first
@@ -146,13 +146,13 @@ static struct
   vec<mem_ref_p> refs_list;
 
   /* The set of memory references accessed in each loop.  */
-  vec<bitmap> refs_in_loop;
+  vec<bitmap_head> refs_in_loop;
 
   /* The set of memory references stored in each loop.  */
-  vec<bitmap> refs_stored_in_loop;
+  vec<bitmap_head> refs_stored_in_loop;
 
   /* The set of memory references stored in each loop, including subloops .  */
-  vec<bitmap> all_refs_stored_in_loop;
+  vec<bitmap_head> all_refs_stored_in_loop;
 
   /* Cache for expanding memory addresses.  */
   struct pointer_map_t *ttae_cache;
@@ -584,13 +584,13 @@ outermost_indep_loop (struct loop *outer
 {
   struct loop *aloop;
 
-  if (bitmap_bit_p (ref->stored, loop->num))
+  if (bitmap_bit_p (&ref->stored, loop->num))
     return NULL;
 
   for (aloop = outer;
        aloop != loop;
        aloop = superloop_at_depth (loop, loop_depth (aloop) + 1))
-    if (!bitmap_bit_p (ref->stored, aloop->num)
+    if (!bitmap_bit_p (&ref->stored, aloop->num)
         && ref_indep_loop_p (aloop, ref))
       return aloop;
 
@@ -1457,9 +1457,9 @@ mem_ref_alloc (tree mem, unsigned hash,
   ao_ref_init (&ref->mem, mem);
   ref->id = id;
   ref->hash = hash;
-  ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack);
-  ref->indep_loop = BITMAP_ALLOC (&lim_bitmap_obstack);
-  ref->dep_loop = BITMAP_ALLOC (&lim_bitmap_obstack);
+  bitmap_initialize (&ref->stored, &lim_bitmap_obstack);
+  bitmap_initialize (&ref->indep_loop, &lim_bitmap_obstack);
+  bitmap_initialize (&ref->dep_loop, &lim_bitmap_obstack);
   ref->accesses_in_loop.create (0);
 
   return ref;
@@ -1487,11 +1487,9 @@ record_mem_ref_loc (mem_ref_p ref, struc
 static void
 mark_ref_stored (mem_ref_p ref, struct loop *loop)
 {
-  for (;
-       loop != current_loops->tree_root
-       && !bitmap_bit_p (ref->stored, loop->num);
-       loop = loop_outer (loop))
-    bitmap_set_bit (ref->stored, loop->num);
+  while (loop != current_loops->tree_root
+         && bitmap_set_bit (&ref->stored, loop->num))
+    loop = loop_outer (loop);
 }
 
 /* Gathers memory references in statement STMT in LOOP, storing the
@@ -1552,10 +1550,10 @@ gather_mem_refs_stmt (struct loop *loop,
       record_mem_ref_loc (ref, loop, stmt, mem);
     }
-  bitmap_set_bit (memory_accesses.refs_in_loop[loop->num], ref->id);
+  bitmap_set_bit (&memory_accesses.refs_in_loop[loop->num], ref->id);
   if (is_stored)
     {
-      bitmap_set_bit
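Editorial illustration, not GCC code: the mark_ref_stored hunk relies on a set-bit operation that reports whether the bit actually changed, which lets one call replace a separate test and set. The sketch below models that with a 32-bit mask and a parent array for the loop tree; `toy_set_bit`, `toy_mark_stored`, the use of loop 0 as the root, and the limit of 32 loops are all assumptions of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set bit N (N < 32) in *BITS.  Returns true only if the bit was
   previously clear, following the convention the patch relies on.  */
static bool
toy_set_bit (uint32_t *bits, unsigned n)
{
  uint32_t mask = UINT32_C (1) << n;
  bool changed = (*bits & mask) == 0;
  *bits |= mask;
  return changed;
}

/* PARENT[i] is the loop enclosing loop i; loop 0 is the root.  Mark
   LOOP and its enclosing loops, stopping at the root or at the first
   loop that was already marked.  Returns how many loops were newly
   marked.  */
static unsigned
toy_mark_stored (uint32_t *stored, const unsigned *parent, unsigned loop)
{
  unsigned marked = 0;
  while (loop != 0 && toy_set_bit (stored, loop))
    {
      loop = parent[loop];
      marked++;
    }
  return marked;
}
```

Stopping at the first already-marked loop is safe in this model because marking always proceeds outward, so everything above a marked loop is marked too.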
[PATCH] Fix PR56689
This fixes VRP to properly fixup loops when it removes edges from the CFG.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2013-03-25  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/56689
        * tree-vrp.c (execute_vrp): Mark loops for fixup if we removed
        any edge.

        * gcc.dg/torture/pr56689.c: New testcase.

Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c (revision 197029)
+++ gcc/tree-vrp.c (working copy)
@@ -9329,7 +9329,11 @@ execute_vrp (void)
     }
 
   if (to_remove_edges.length () > 0)
-    free_dominance_info (CDI_DOMINATORS);
+    {
+      free_dominance_info (CDI_DOMINATORS);
+      if (current_loops)
+        loops_state_set (LOOPS_NEED_FIXUP);
+    }
 
   to_remove_edges.release ();
   to_update_switch_stmts.release ();
Index: gcc/testsuite/gcc.dg/torture/pr56689.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/pr56689.c (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr56689.c (working copy)
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+
+extern int baz ();
+extern void bar (void);
+extern void noret (void) __attribute__ ((__noreturn__));
+
+void
+fix_register (const char *name, int fixed, int call_used, int nregs)
+{
+  int i;
+  int reg;
+
+  if ((reg = baz ()) >= 0)
+    {
+      for (i = reg; i < nregs; i++)
+        {
+          if ((i == 15 || i == 11) && (fixed == 0 || call_used == 0))
+            {
+              switch (fixed)
+                {
+                case 0:
+                  switch (call_used)
+                    {
+                    case 1:
+                      bar ();
+                      break;
+                    default:
+                      (noret ());
+                    }
+                case 1:
+                  switch (call_used)
+                    {
+                    case 1:
+                      break;
+                    case 0:
+                    default:
+                      (noret ());
+                    }
+                  break;
+                default:
+                  (noret ());
+                }
+            }
+        }
+    }
+}
Re: [PATCH] libgcc: Add DWARF info to aeabi_ldivmod and aeabi_uldivmod
On 03/18/13 19:20, Meador Inge wrote:
Ping.

On 03/05/2013 12:15 PM, Meador Inge wrote:
Hi All,

This patch fixes a minor annoyance that causes backtraces to disappear inside of aeabi_ldivmod and aeabi_uldivmod due to the lack of appropriate DWARF information. I fixed the problem by adding the necessary cfi_* macros in these functions.

OK?

This is OK .

R

2013-03-05  Meador Inge  <mead...@codesourcery.com>

        * config/arm/bpabi.S (aeabi_ldivmod): Add DWARF information for
        computing the location of the link register.
        (aeabi_uldivmod): Ditto.

Index: libgcc/config/arm/bpabi.S
===================================================================
--- libgcc/config/arm/bpabi.S (revision 196470)
+++ libgcc/config/arm/bpabi.S (working copy)
@@ -123,6 +123,7 @@ ARM_FUNC_START aeabi_ulcmp
 #ifdef L_aeabi_ldivmod
 
 ARM_FUNC_START aeabi_ldivmod
+        cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
         test_div_by_zero signed
 
         sub sp, sp, #8
@@ -132,17 +133,20 @@ ARM_FUNC_START aeabi_ldivmod
 #else
         do_push {sp, lr}
 #endif
+98:     cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
         bl SYM(__gnu_ldivmod_helper) __PLT__
         ldr lr, [sp, #4]
         add sp, sp, #8
         do_pop {r2, r3}
         RET
+        cfi_end LSYM(Lend_aeabi_ldivmod)
 
 #endif /* L_aeabi_ldivmod */
 
 #ifdef L_aeabi_uldivmod
 
 ARM_FUNC_START aeabi_uldivmod
+        cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
         test_div_by_zero unsigned
 
         sub sp, sp, #8
@@ -152,11 +156,13 @@ ARM_FUNC_START aeabi_uldivmod
 #else
         do_push {sp, lr}
 #endif
+98:     cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
         bl SYM(__gnu_uldivmod_helper) __PLT__
         ldr lr, [sp, #4]
         add sp, sp, #8
         do_pop {r2, r3}
         RET
-
+        cfi_end LSYM(Lend_aeabi_uldivmod)
+
 #endif /* L_aeabi_divmod */
Re: [PATCH][ARM] Handle unordered comparison cases in NEON vcond
On 03/18/13 12:09, Kyrylo Tkachov wrote:
Hi all,

Given code:

#define MAX(a, b) (a > b ? a : b)
void foo (int ilast, float* w, float* w2)
{
  int i;
  for (i = 0; i < ilast; ++i)
    {
      w[i] = MAX (0.0f, w2[i]);
    }
}

compiled with
-O1 -funsafe-math-optimizations -ftree-vectorize -mfpu=neon -mfloat-abi=hard
on arm-none-eabi will cause an ICE when trying to expand the vcond pattern.

Looking at the vcond pattern in neon.md, the predicate for the comparison operator (arm_comparison_operator) uses maybe_get_arm_condition_code which is not needed for vcond since we don't care about the ARM condition code (we can handle all the comparison cases ourselves in the expander).

Changing the predicate to comparison_operator allows the expander to proceed but it ICEs again because the pattern doesn't handle the floating point unordered cases! (i.e. UNGT, UNORDERED, UNLE etc).

Adding support for the unordered cases is very similar to the aarch64 port added here:
http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00957.html

This patch adapts that code to the arm port.

Added the testcase that exposed the ICE initially and also the UNORDERED and LTGT variations of it.

No regressions on arm-none-eabi.

Ok for trunk?

Please file a ticket in bugzilla with this testcase and triage with respect to other release branches as well as this is likely to show up in 4.8 as well. Can you please also check what happens with 4.6 and 4.7 ?

This is OK for trunk with the appropriate bug id in the changelog(s)

regards
Ramana

Thanks,
Kyrill

gcc/ChangeLog

2013-03-18  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

        * config/arm/iterators.md (v_cmp_result): New mode attribute.
        * config/arm/neon.md (vcond<mode><mode>): Handle unordered cases.

gcc/testsuite/ChangeLog

2013-03-18  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

        * gcc.target/arm/neon-vcond-gt.c: New test.
        * gcc.target/arm/neon-vcond-ltgt.c: Likewise.
        * gcc.target/arm/neon-vcond-unordered.c: Likewise.
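Editorial illustration, not part of the patch: the unordered comparison codes named in the message (UNGT, UNORDERED, LTGT) differ from the ordinary ones only when an operand is a NaN. The sketch below spells that out in scalar C with the standard `<math.h>` comparison macros; the `toy_` helpers are hypothetical and say nothing about how the NEON expander implements these cases.

```c
#include <math.h>
#include <stdbool.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* UNGT: true when the operands are unordered (a NaN is involved) or
   when A is greater than B.  */
static bool
toy_ungt (float a, float b)
{
  return isunordered (a, b) || a > b;
}

/* LTGT: true when A is less than or greater than B; false for equal
   or unordered operands.  */
static bool
toy_ltgt (float a, float b)
{
  return islessgreater (a, b);
}

/* The scalar operation from the testcase, for reference.  */
static float
toy_max (float a, float b)
{
  return MAX (a, b);
}
```

Because an ordinary `>` is false whenever a NaN is involved, `toy_max` returns its second argument in that case, whichever operand held the NaN.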
Fix bug with simple returns on cc0 targets
Hi,

for a private port which both is a cc0 target and has conditional returns, emit_use_return_register_into_block will try to emit the use return register sequence between a cc0 setter and a cc0 user.

Fixed thusly, tested on x86_64-suse-linux, applied on the mainline.

2013-03-25  Eric Botcazou  <ebotca...@adacore.com>

        * function.c (emit_use_return_register_into_block): On cc0 targets,
        do not emit the sequence between cc0 setter and user.

-- 
Eric Botcazou

Index: function.c
===================================================================
--- function.c (revision 197003)
+++ function.c (working copy)
@@ -5598,12 +5598,17 @@ prepare_shrink_wrap (basic_block entry_b
 static void
 emit_use_return_register_into_block (basic_block bb)
 {
-  rtx seq;
+  rtx seq, insn;
   start_sequence ();
   use_return_register ();
   seq = get_insns ();
   end_sequence ();
-  emit_insn_before (seq, BB_END (bb));
+  insn = BB_END (bb);
+#ifdef HAVE_cc0
+  if (reg_mentioned_p (cc0_rtx, PATTERN (insn)))
+    insn = prev_cc0_setter (insn);
+#endif
+  emit_insn_before (seq, insn);
 }
Fix bug in reload_combine with simple returns
Hi, another bug discovered for a cc0 target with conditional returns: the special code in reload_combine dealing with returns hasn't been updated for simple returns. Fixed thusly, tested on x86_64-suse-linux, applied on the mainline. 2013-03-25 Eric Botcazou ebotca...@adacore.com * postreload.c (reload_combine): Fix code detecting returns. -- Eric BotcazouIndex: postreload.c === --- postreload.c (revision 196816) +++ postreload.c (working copy) @@ -1387,7 +1387,7 @@ reload_combine (void) } } - if (control_flow_insn GET_CODE (PATTERN (insn)) != RETURN) + if (control_flow_insn !ANY_RETURN_P (PATTERN (insn))) { /* Non-spill registers might be used at the call destination in some unknown fashion, so we have to mark the unknown use. */ @@ -1395,13 +1395,19 @@ reload_combine (void) if ((condjump_p (insn) || condjump_in_parallel_p (insn)) JUMP_LABEL (insn)) - live = LABEL_LIVE (JUMP_LABEL (insn)); + { + if (ANY_RETURN_P (JUMP_LABEL (insn))) + live = NULL; + else + live = LABEL_LIVE (JUMP_LABEL (insn)); + } else live = ever_live_at_start; - for (r = 0; r FIRST_PSEUDO_REGISTER; r++) - if (TEST_HARD_REG_BIT (*live, r)) - reg_state[r].use_index = -1; + if (live) + for (r = 0; r FIRST_PSEUDO_REGISTER; r++) + if (TEST_HARD_REG_BIT (*live, r)) + reg_state[r].use_index = -1; } reload_combine_note_use (PATTERN (insn), insn, reload_combine_ruid,
Fill more delay slots in conditional returns
Hi, for a private port with conditional returns and delay slots, only the simple algorithm (fill_simple_delay_slots) is able to fill the slots. It's because get_branch_condition just punts on conditional returns. Fixed thusly. While I investigated this, I realized that the block of code in fill_simple_delay_slots between line 2097 and line 2274 is dead for JUMP insns (and has been so for a long time, which is consistent with various comments in the code, for example the head comment of fill_eager_delay_slots) so the patch also cleans it up (modulo the formatting to make the patch readable). Jeff, any objections? Tested on SPARC/Solaris, no difference in the generated code at -O2 for the gcc.c-torture/compile testsuite. 2013-03-25 Eric Botcazou ebotca...@adacore.com * reorg.c (get_branch_condition): Deal with conditional returns. (fill_simple_delay_slots): Remove dead code dealing with jumps. -- Eric BotcazouIndex: reorg.c === --- reorg.c (revision 196816) +++ reorg.c (working copy) @@ -921,8 +921,8 @@ get_branch_condition (rtx insn, rtx targ if (condjump_in_parallel_p (insn)) pat = XVECEXP (pat, 0, 0); - if (ANY_RETURN_P (pat)) -return pat == target ? 
const_true_rtx : 0; + if (ANY_RETURN_P (pat) pat == target) +return const_true_rtx; if (GET_CODE (pat) != SET || SET_DEST (pat) != pc_rtx) return 0; @@ -933,14 +933,16 @@ get_branch_condition (rtx insn, rtx targ else if (GET_CODE (src) == IF_THEN_ELSE XEXP (src, 2) == pc_rtx - GET_CODE (XEXP (src, 1)) == LABEL_REF - XEXP (XEXP (src, 1), 0) == target) + ((GET_CODE (XEXP (src, 1)) == LABEL_REF + XEXP (XEXP (src, 1), 0) == target) + || (ANY_RETURN_P (XEXP (src, 1)) XEXP (src, 1) == target))) return XEXP (src, 0); else if (GET_CODE (src) == IF_THEN_ELSE XEXP (src, 1) == pc_rtx - GET_CODE (XEXP (src, 2)) == LABEL_REF - XEXP (XEXP (src, 2), 0) == target) + ((GET_CODE (XEXP (src, 2)) == LABEL_REF + XEXP (XEXP (src, 2), 0) == target) + || (ANY_RETURN_P (XEXP (src, 2)) XEXP (src, 2) == target))) { enum rtx_code rev; rev = reversed_comparison_code (XEXP (src, 0), insn); @@ -2129,35 +2131,19 @@ fill_simple_delay_slots (int non_jumps_p Presumably, we should also check to see if we could get back to this function via `setjmp'. */ ! can_throw_internal (insn) - (!JUMP_P (insn) - || ((condjump_p (insn) || condjump_in_parallel_p (insn)) - ! simplejump_p (insn) - !ANY_RETURN_P (JUMP_LABEL (insn) + !JUMP_P (insn)) { - /* Invariant: If insn is a JUMP_INSN, the insn's jump - label. Otherwise, zero. 
*/ - rtx target = 0; int maybe_never = 0; rtx pat, trial_delay; CLEAR_RESOURCE (needed); CLEAR_RESOURCE (set); + mark_set_resources (insn, set, 0, MARK_SRC_DEST_CALL); + mark_referenced_resources (insn, needed, true); if (CALL_P (insn)) - { - mark_set_resources (insn, set, 0, MARK_SRC_DEST_CALL); - mark_referenced_resources (insn, needed, true); - maybe_never = 1; - } - else - { - mark_set_resources (insn, set, 0, MARK_SRC_DEST_CALL); - mark_referenced_resources (insn, needed, true); - if (JUMP_P (insn)) - target = JUMP_LABEL (insn); - } + maybe_never = 1; - if (target == 0 || ANY_RETURN_P (target)) for (trial = next_nonnote_insn (insn); !stop_search_p (trial, 1); trial = next_trial) { @@ -2217,9 +2203,8 @@ fill_simple_delay_slots (int non_jumps_p slot since these insns could clobber the condition code. */ set.cc = 1; - /* If this is a call or jump, we might not get here. */ - if (CALL_P (trial_delay) - || JUMP_P (trial_delay)) + /* If this is a call, we might not get here. */ + if (CALL_P (trial_delay)) maybe_never = 1; } @@ -2232,7 +2217,6 @@ fill_simple_delay_slots (int non_jumps_p trial jump_to_label_p (trial) simplejump_p (trial) - (target == 0 || JUMP_LABEL (trial) == target) (next_trial = next_active_insn (JUMP_LABEL (trial))) != 0 ! (NONJUMP_INSN_P (next_trial) GET_CODE (PATTERN (next_trial)) == SEQUENCE) @@ -2264,11 +2248,6 @@ fill_simple_delay_slots (int non_jumps_p delay_list); slots_filled++; reorg_redirect_jump (trial, new_label); - - /* If we merged because we both jumped to the same place, - redirect the original insn also. */ - if (target) - reorg_redirect_jump (insn, new_label); } } }
Do not disable -fomit-frame-pointer on !ACCUMULATE_OUTGOING_ARGS targets
Hi, process_options has had these lines for a couple of releases: /* ??? Unwind info is not correct around the CFG unless either a frame pointer is present or A_O_A is set. Fixing this requires rewriting unwind info generation to be aware of the CFG and propagating states around edges. */ if (flag_unwind_tables !ACCUMULATE_OUTGOING_ARGS flag_omit_frame_pointer) { warning (0, unwind tables currently require a frame pointer for correctness); flag_omit_frame_pointer = 0; } I think that's too broad: for example, it's a common pattern for a target to really enable -fomit-frame-pointer only if the stack pointer is unchanging in the function; in this case, the unwind info will be correct even if !A_O_A. So I'm proposing to disable -fomit-frame-pointer on a per-function basis instead in ira_setup_eliminable_regset. Tested on x86_64-suse-linux, OK for the mainline? 2013-03-25 Eric Botcazou ebotca...@adacore.com * toplev.c (process_options): Do not disable -fomit-frame-pointer on a general basis if unwind info is requested and ACCUMULATE_OUTGOING_ARGS is not enabled. * ira.c (ira_setup_eliminable_regset): Instead disable it only on a per function basis if the stack pointer is not unchanging in the function. -- Eric BotcazouIndex: ira.c === --- ira.c (revision 196816) +++ ira.c (working copy) @@ -1875,6 +1875,15 @@ ira_setup_eliminable_regset (bool from_i || crtl-stack_realign_needed || targetm.frame_pointer_required ()); + /* ??? Unwind info is not correct around the CFG unless either a frame + pointer is present or A_O_A is set or the stack pointer is unchanging. + Fixing this requires rewriting unwind info generation to be aware of + the CFG and propagating states around edges. */ + if (flag_unwind_tables + !ACCUMULATE_OUTGOING_ARGS + !crtl-sp_is_unchanging) +frame_pointer_needed = true; + if (from_ira_p ira_use_lra_p) /* It can change FRAME_POINTER_NEEDED. We call it only from IRA because it is expensive. 
*/ Index: toplev.c === --- toplev.c (revision 196816) +++ toplev.c (working copy) @@ -1527,18 +1527,6 @@ process_options (void) if (!flag_stack_protect) warn_stack_protect = 0; - /* ??? Unwind info is not correct around the CFG unless either a frame - pointer is present or A_O_A is set. Fixing this requires rewriting - unwind info generation to be aware of the CFG and propagating states - around edges. */ - if (flag_unwind_tables !ACCUMULATE_OUTGOING_ARGS - flag_omit_frame_pointer) -{ - warning (0, unwind tables currently require a frame pointer - for correctness); - flag_omit_frame_pointer = 0; -} - /* Address Sanitizer needs porting to each target architecture. */ if (flag_asan (targetm.asan_shadow_offset == NULL
[testsuite] Cap VLEN in gcc.c-torture/execute/20011008-3.c
Hi, gcc.c-torture/execute/20011008-3.c has these lines: #ifndef STACK_SIZE #define VLEN 1235 #else #define VLEN (STACK_SIZE/10) #endif which means that VLEN is _not_ capped if STACK_SIZE is defined, which goes against the very purpose of STACK_SIZE in the testing framework. Fixed thusly, tested on x86_64-suse-linux, OK for the mainline? 2013-03-25 Eric Botcazou ebotca...@adacore.com * gcc.c-torture/execute/20011008-3.c: Cap VLEN with STACK_SIZE too. -- Eric Botcazou Index: gcc.c-torture/execute/20011008-3.c === --- gcc.c-torture/execute/20011008-3.c (revision 196816) +++ gcc.c-torture/execute/20011008-3.c (working copy) @@ -81,10 +81,10 @@ __db_txnlist_lsnadd(int val, DB_TXNLIST return val; } -#ifndef STACK_SIZE -#define VLEN 1235 -#else +#if defined (STACK_SIZE) && STACK_SIZE < 12350 #define VLEN (STACK_SIZE/10) +#else +#define VLEN 1235 #endif int main (void)
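The capping rule can be tried in isolation. The sketch below assumes the operators stripped from the archived diff were `&&` and `<`, and it picks an arbitrary STACK_SIZE of 4096 for demonstration; capped_vlen is a hypothetical helper that restates the preprocessor rule as a function so both branches can be checked in one build.

```c
#include <assert.h>

/* Hypothetical board setting; the testing framework would normally
   supply this with -DSTACK_SIZE=... on the command line.  */
#ifndef STACK_SIZE
#define STACK_SIZE 4096
#endif

/* Assumed reconstruction of the patched condition: scale VLEN down
   only when the stack is small, and never let it exceed 1235.  */
#if defined (STACK_SIZE) && STACK_SIZE < 12350
#define VLEN (STACK_SIZE / 10)
#else
#define VLEN 1235
#endif

/* The same rule as a function (illustrative helper, not part of the
   test case).  */
int capped_vlen (long stack_size)
{
  return stack_size < 12350 ? (int) (stack_size / 10) : 1235;
}

/* Small usage example.  */
void run_examples (void)
{
  assert (VLEN == capped_vlen (STACK_SIZE));
  assert (capped_vlen (1000000) == 1235);
}
```

With the old `#ifndef` form, a STACK_SIZE of 1000000 would have produced a VLEN of 100000, far above the default the test was written for.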
Re: [PATCH GCC]Relax the probability condition in CE pass when optimizing for code size
Quoting Bin Cheng bin.ch...@arm.com: During the work I observed passes before combine might interfere with CE pass, so this patch is enabled for ce2/ce3 after combination pass. It is tested on x86/thumb2 for both normal and Os. Is it ok for trunk? There are bound to be target and application specific variations on which scaling factors work best. 2013-03-25 Bin Cheng bin.ch...@arm.com * ifcvt.c (ifcvt_after_combine): New static variable. It would make more sense to pass in the scale factor as an argument to if_convert. And get the respective values from a set of gcc parameters, so they can be tweaked by ports and/or by a user or machine-learning framework (e.g. Milepost) supplying the appropriate --param option.
Re: [patch] Unified debug dump function names.
Lawrence == Lawrence Crowl cr...@googlers.com writes: Lawrence This patch is somewhat different from the original plan at Lawrence gcc.gnu.org/wiki/cxx-conversion/debugging-dumps. The reason Lawrence is that gdb has an incomplete implementation of C++ call syntax; Lawrence requiring explicit specification of template arguments and explicit Lawrence specification of function arguments even when they have default Lawrence values. Note that the latter is because GCC doesn't emit this information. As for the former ... we have a patch that works in some cases, but it's actually unclear to me how well the debugger can do in general here. We haven't put it in since it seems better to require users to be explicit than to silently do the wrong thing in some cases. Tom
[PATCH] Fix PR56694
This fixes PR56694 - the code keeping BLOCKs live is not looking at the EH tree for references. In the must-not-throw failure_loc such references can now appear. Fixed by reverting that to 4.7 behavior. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk and 4.8 branch. Richard. 2013-03-25 Richard Biener rguent...@suse.de PR middle-end/56694 * tree-eh.c (lower_eh_must_not_throw): Strip BLOCKs from the must-not-throw stmt location. * g++.dg/torture/pr56694.C: New testcase. Index: gcc/tree-eh.c === *** gcc/tree-eh.c (revision 197029) --- gcc/tree-eh.c (working copy) *** lower_eh_must_not_throw (struct leh_stat *** 1855,1861 this_region = gen_eh_region_must_not_throw (state-cur_region); this_region-u.must_not_throw.failure_decl = gimple_eh_must_not_throw_fndecl (inner); ! this_region-u.must_not_throw.failure_loc = gimple_location (tp); /* In order to get mangling applied to this decl, we must mark it used now. Otherwise, pass_ipa_free_lang_data won't think it --- 1855,1862 this_region = gen_eh_region_must_not_throw (state-cur_region); this_region-u.must_not_throw.failure_decl = gimple_eh_must_not_throw_fndecl (inner); ! this_region-u.must_not_throw.failure_loc ! = LOCATION_LOCUS (gimple_location (tp)); /* In order to get mangling applied to this decl, we must mark it used now. 
Otherwise, pass_ipa_free_lang_data won't think it Index: gcc/testsuite/g++.dg/torture/pr56694.C === *** gcc/testsuite/g++.dg/torture/pr56694.C (revision 0) --- gcc/testsuite/g++.dg/torture/pr56694.C (working copy) *** *** 0 --- 1,30 + // { dg-do compile } + // { dg-options -fopenmp } + + class GException { + public: + class vector_mismatch { + public: + vector_mismatch(int size1, int size2); + }; + }; + class GVector{ + public: + GVector operator+=(const GVector v); + int m_num; + double* m_data; + }; + inline GVector GVector::operator+= (const GVector v) + { + if (m_num != v.m_num) + throw GException::vector_mismatch(m_num, v.m_num); + for (int i = 0; i m_num; ++i) m_data[i] += v.m_data[i]; + }; + void eval(GVector* m_gradient, GVector* vect_cpy_grad, int n) + { + #pragma omp sections + { + for (int i = 0; i n; ++i) + *m_gradient += vect_cpy_grad[i]; + } + }
Re: [Patch, Fortran] C Binding - module+intrinsic cleanup+bug fixes
On 25/03/2013 11:11, Tobias Burnus wrote: Is the updated patch now okay for the trunk? (It was built and regtested on x86-64-gnu-linux.) OK. Many thanks. Mikael
RE: [PATCH][ARM] use vsel instruction for floating point conditional moves in ARMv8
-Original Message- From: Ramana Radhakrishnan Sent: 18 February 2013 11:51 To: Kyrylo Tkachov Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw Subject: Re: [PATCH][ARM] use vsel instruction for floating point conditional moves in ARMv8 On 01/30/13 09:24, Kyrylo Tkachov wrote: Hi all, This patch uses the new ARMv8 AArch32 vsel instruction to implement conditional moves of floating point numbers. For example, an instruction of the form: vselcond.f32 s0, s1, s2 means s0 := cond ? s1 : s2 This can be useful, among other places, in Thumb2 because it doesn't require an enclosing IT block. A small catch: The condition code used in vsel can only be one of {GE, GT, EQ, VS}. If we want to use their negations {LT, LE, NE, VC} we just flip the source operands. A new predicate is introduced that checks that the comparison yields an ARM condition code in the set {GE, GT, EQ, VS, LT, LE, NE, VC}. New compilation tests are added. They pass on a model and no new regressions on arm-none-eabi with qemu. Ok for trunk? Ok for stage1 4.9. Hi Ramana, Thanks for the review. Re-tested on arm-none-eabi against current trunk and applied as r197052. Ramana Thanks, Kyrill Thanks, Kyrill gcc/ChangeLog 2013-01-30 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/arm/arm.md (f_sels, f_seld): New types. (*cmovmode): New pattern. * config/arm/predicates.md (arm_vsel_comparison_operator): New predicate. gcc/testsuite/ChangeLog 2013-01-30 Kyrylo Tkachov kyrylo.tkac...@arm.com * gcc.target/arm/vseleqdf.c: New test. * gcc.target/arm/vseleqsf.c: Likewise. * gcc.target/arm/vselgedf.c: Likewise. * gcc.target/arm/vselgesf.c: Likewise. * gcc.target/arm/vselgtdf.c: Likewise. * gcc.target/arm/vselgtsf.c: Likewise. * gcc.target/arm/vselledf.c: Likewise. * gcc.target/arm/vsellesf.c: Likewise. * gcc.target/arm/vselltdf.c: Likewise. * gcc.target/arm/vselltsf.c: Likewise. * gcc.target/arm/vselnedf.c: Likewise. * gcc.target/arm/vselnesf.c: Likewise. * gcc.target/arm/vselvcdf.c: Likewise. 
* gcc.target/arm/vselvcsf.c: Likewise. * gcc.target/arm/vselvsdf.c: Likewise. * gcc.target/arm/vselvssf.c: Likewise.
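The operand-flipping trick described in the patch summary can be modelled in a few lines of C. This is only a sketch under stated assumptions: the flag model and all names (cond_holds, vsel_hw, cmov_float) are invented for illustration, and it ignores how a VFP comparison sets the flags for unordered operands.

```c
#include <assert.h>
#include <stdbool.h>

enum cond { GE, GT, EQ, VS, LT, LE, NE, VC };

/* The four ARM condition flags.  */
struct flags { bool n, z, c, v; };

/* Standard ARM meaning of each condition code.  */
static bool cond_holds (enum cond c, struct flags f)
{
  switch (c)
    {
    case EQ: return f.z;
    case NE: return !f.z;
    case VS: return f.v;
    case VC: return !f.v;
    case GE: return f.n == f.v;
    case LT: return f.n != f.v;
    case GT: return !f.z && f.n == f.v;
    default: return f.z || f.n != f.v;	/* LE */
    }
}

/* Model of the instruction: only GE, GT, EQ and VS are encodable.  */
static float vsel_hw (enum cond c, struct flags f, float s1, float s2)
{
  assert (c == GE || c == GT || c == EQ || c == VS);
  return cond_holds (c, f) ? s1 : s2;
}

/* Map a negated condition to its encodable inverse.  */
static enum cond inverse (enum cond c)
{
  switch (c)
    {
    case LT: return GE;
    case LE: return GT;
    case NE: return EQ;
    case VC: return VS;
    default: return c;
    }
}

/* cond ? s1 : s2 for any of the eight conditions: negated conditions
   use the inverse encoding with the source operands swapped.  */
float cmov_float (enum cond c, struct flags f, float s1, float s2)
{
  if (c == GE || c == GT || c == EQ || c == VS)
    return vsel_hw (c, f, s1, s2);
  return vsel_hw (inverse (c), f, s2, s1);
}

/* Small usage example with only the Z flag set.  */
void run_examples (void)
{
  struct flags z_set = { .z = true };
  assert (cmov_float (EQ, z_set, 1.0f, 2.0f) == 1.0f);
  assert (cmov_float (NE, z_set, 1.0f, 2.0f) == 2.0f);
}
```

Swapping the operands keeps the select a single instruction for all eight conditions, which is the point of the new predicate accepting both halves of the set.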
Re: Do not disable -fomit-frame-pointer on !ACCUMULATE_OUTGOING_ARGS targets
On 03/25/2013 04:26 AM, Eric Botcazou wrote: process_options has had these lines for a couple of releases: /* ??? Unwind info is not correct around the CFG unless either a frame pointer is present or A_O_A is set. Fixing this requires rewriting unwind info generation to be aware of the CFG and propagating states around edges. */ Heh. We've actually fixed this now -- unwind info generation aware of the cfg is exactly what pass_dwarf2_frame does. So I guess this comment has been out of date since gcc 4.7. r~
*ping* [patch, fortran, 4.9] Dependency and string length calculation improvements
*ping* Slightly updated patch below, with a better test case as suggested by Dominique. OK for trunk? 2013-03-16 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/45159 * gfortran.h (gfc_dep_difference): Add prototype. * dependency.c (discard_nops): New function. (gfc_dep_difference): New function. (check_section_vs_section): Use gfc_dep_difference to calculate the difference of starting indices. * trans-expr.c (gfc_conv_substring): Use gfc_dep_difference to calculate the length of substrings where possible. 2013-03-16 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/45159 * gfortran.dg/string_length_2.f90: New test. * gfortran.dg/dependency_41.f90: New test. Index: gfortran.h === --- gfortran.h (Revision 196574) +++ gfortran.h (Arbeitskopie) @@ -2959,6 +2959,7 @@ gfc_namespace* gfc_build_block_ns (gfc_namespace * /* dependency.c */ int gfc_dep_compare_functions (gfc_expr *, gfc_expr *, bool); int gfc_dep_compare_expr (gfc_expr *, gfc_expr *); +bool gfc_dep_difference (gfc_expr *, gfc_expr *, mpz_t *); /* check.c */ gfc_try gfc_check_same_strlen (const gfc_expr*, const gfc_expr*, const char*); Index: dependency.c === --- dependency.c (Revision 196574) +++ dependency.c (Arbeitskopie) @@ -500,7 +500,270 @@ gfc_dep_compare_expr (gfc_expr *e1, gfc_expr *e2) } } +/* Helper function to look through parens and unary plus. */ +static gfc_expr* +discard_nops (gfc_expr *e) +{ + + while (e e-expr_type == EXPR_OP + (e-value.op.op == INTRINSIC_UPLUS + || e-value.op.op == INTRINSIC_PARENTHESES)) +e = e-value.op.op1; + + return e; +} +/* Return the difference between two expressions. Integer expressions of + the form + + X + constant, X - constant and constant + X + + are handled. Return true on success, false on failure. result is assumed + to be uninitialized on entry, and will be initialized on success. 
+*/ + +bool +gfc_dep_difference (gfc_expr *e1, gfc_expr *e2, mpz_t *result) +{ + gfc_expr *e1_op1, *e1_op2, *e2_op1, *e2_op2; + + if (e1 == NULL || e2 == NULL) +return false; + + if (e1-ts.type != BT_INTEGER || e2-ts.type != BT_INTEGER) +return false; + + e1 = discard_nops (e1); + e2 = discard_nops (e2); + + /* Inizialize tentatively, clear if we don't return anything. */ + mpz_init (*result); + + /* Case 1: c1 - c2 = c1 - c2, trivially. */ + + if (e1-expr_type == EXPR_CONSTANT e2-expr_type == EXPR_CONSTANT) +{ + mpz_sub (*result, e1-value.integer, e2-value.integer); + return true; +} + + if (e1-expr_type == EXPR_OP e1-value.op.op == INTRINSIC_PLUS) +{ + e1_op1 = discard_nops (e1-value.op.op1); + e1_op2 = discard_nops (e1-value.op.op2); + + /* Case 2: (X + c1) - X = c1. */ + if (e1_op2-expr_type == EXPR_CONSTANT + gfc_dep_compare_expr (e1_op1, e2) == 0) + { + mpz_set (*result, e1_op2-value.integer); + return true; + } + + /* Case 3: (c1 + X) - X = c1. */ + if (e1_op1-expr_type == EXPR_CONSTANT + gfc_dep_compare_expr (e1_op2, e2) == 0) + { + mpz_set (*result, e1_op1-value.integer); + return true; + } + + if (e2-expr_type == EXPR_OP e2-value.op.op == INTRINSIC_PLUS) + { + e2_op1 = discard_nops (e2-value.op.op1); + e2_op2 = discard_nops (e2-value.op.op2); + + if (e1_op2-expr_type == EXPR_CONSTANT) + { + /* Case 4: X + c1 - (X + c2) = c1 - c2. */ + if (e2_op2-expr_type == EXPR_CONSTANT + gfc_dep_compare_expr (e1_op1, e2_op1) == 0) + { + mpz_sub (*result, e1_op2-value.integer, + e2_op2-value.integer); + return true; + } + /* Case 5: X + c1 - (c2 + X) = c1 - c2. */ + if (e2_op1-expr_type == EXPR_CONSTANT + gfc_dep_compare_expr (e1_op1, e2_op2) == 0) + { + mpz_sub (*result, e1_op2-value.integer, + e2_op1-value.integer); + return true; + } + } + else if (e1_op1-expr_type == EXPR_CONSTANT) + { + /* Case 6: c1 + X - (X + c2) = c1 - c2. 
*/ + if (e2_op2-expr_type == EXPR_CONSTANT + gfc_dep_compare_expr (e1_op2, e2_op1) == 0) + { + mpz_sub (*result, e1_op1-value.integer, + e2_op2-value.integer); + return true; + } + /* Case 7: c1 + X - (c2 + X) = c1 - c2. */ + if (e2_op1-expr_type == EXPR_CONSTANT + gfc_dep_compare_expr (e1_op2, e2_op2) == 0) + { + mpz_sub (*result, e1_op1-value.integer, + e2_op1-value.integer); + return true; + } + } + } + + if (e2-expr_type == EXPR_OP e2-value.op.op == INTRINSIC_MINUS) + { + e2_op1 = discard_nops (e2-value.op.op1); + e2_op2 = discard_nops (e2-value.op.op2); + + if (e1_op2-expr_type == EXPR_CONSTANT) + { + /* Case 8: X + c1 - (X - c2) = c1 + c2. */ + if (e2_op2-expr_type == EXPR_CONSTANT + gfc_dep_compare_expr (e1_op1, e2_op1) == 0) + { + mpz_add
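The case analysis for the PLUS forms can be condensed into a small standalone model. The sketch below is not the gfortran code: it uses a toy expression type and `long` instead of gfc_expr and mpz_t, same_expr stands in for `gfc_dep_compare_expr (...) == 0`, and it covers only the constant and "X + constant" cases, not the MINUS cases that the diff goes on to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum kind { K_CONST, K_VAR, K_PLUS };

/* Toy expression node (hypothetical, for illustration only).  */
struct expr
{
  enum kind kind;
  long value;			/* K_CONST */
  const char *name;		/* K_VAR */
  const struct expr *op1, *op2;	/* K_PLUS */
};

/* Structural equality of two expressions.  */
static bool same_expr (const struct expr *a, const struct expr *b)
{
  if (a->kind != b->kind)
    return false;
  switch (a->kind)
    {
    case K_CONST: return a->value == b->value;
    case K_VAR: return strcmp (a->name, b->name) == 0;
    default: return same_expr (a->op1, b->op1) && same_expr (a->op2, b->op2);
    }
}

/* Split "X + c" or "c + X" into X and c.  Returns false otherwise.  */
static bool split_plus (const struct expr *e, const struct expr **rest,
			long *c)
{
  if (e->kind != K_PLUS)
    return false;
  if (e->op2->kind == K_CONST)
    { *rest = e->op1; *c = e->op2->value; return true; }
  if (e->op1->kind == K_CONST)
    { *rest = e->op2; *c = e->op1->value; return true; }
  return false;
}

/* Compute e1 - e2 when it is a known constant.  */
bool dep_difference (const struct expr *e1, const struct expr *e2,
		     long *result)
{
  const struct expr *r1, *r2;
  long c1, c2;

  if (e1 == NULL || e2 == NULL)
    return false;

  /* c1 - c2.  */
  if (e1->kind == K_CONST && e2->kind == K_CONST)
    { *result = e1->value - e2->value; return true; }

  if (split_plus (e1, &r1, &c1))
    {
      /* (X + c1) - X and (c1 + X) - X.  */
      if (same_expr (r1, e2))
	{ *result = c1; return true; }
      /* (X + c1) - (X + c2) in any operand order.  */
      if (split_plus (e2, &r2, &c2) && same_expr (r1, r2))
	{ *result = c1 - c2; return true; }
    }
  return false;
}

/* Small usage example: (i + 3) - i.  */
void run_examples (void)
{
  struct expr i = { K_VAR, 0, "i", NULL, NULL };
  struct expr c3 = { K_CONST, 3, NULL, NULL, NULL };
  struct expr i_plus_3 = { K_PLUS, 0, NULL, &i, &c3 };
  long d = 0;
  assert (dep_difference (&i_plus_3, &i, &d) && d == 3);
}
```

In the patch the same idea lets the front end see that a substring such as c(i:i+3) has a constant length even though its bounds are not constant.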
Re: [patch, mips] Patch to control the use of integer madd/msub instructions
On Sat, 2013-03-23 at 14:50 +, Richard Sandiford wrote: This is similar in spirit to -mbranch-likely. It'd be good for consistency if they were defined in a similar style. I think that means removing !TARGET_MIPS16 from ISA_HAS_MADD_MSUB and instead having: #define GENERATE_MADD_MSUB (TARGET_IMADD !TARGET_MIPS16) There would also be: #define PTF_AVOID_IMADD 0x2 which should be included in the 74k description, and a block similar to the MASK_BRANCHLIKELY one in mips_option_override. There needs to be documentation in invoke.texi. I can do it this way if you want, I was using -mllsc as my template for how to implement this. Do you think the -mllsc flag should be implemented in the same way as -mbranch-likely? But -- sorry for the soapbox speech -- it would be better to retune so that new options aren't needed. I'm assuming you're testing against the same microarchitecture that the original 74k authors were. If so, it seems like -mimadd is just an option for choosing between two bad implementations. One uses MADD and MSUB unconditionally (contrary to the experience of the original authors) and the other never uses it at all (contrary to your experience). That's not enough reason to reject the patch, just saying :-) I agree that the 74k should only be using the integer madd/msub instruction where it makes sense but I think having a flag to allow the user to override it is still a good thing because the compiler won't always be right. Actually, one of my reasons for adding this flag is to make it easier for me to do 74k runs with and without madd/msub and see where we are using (but shouldn't) and hopefully improve the current implementation. Steve Ellcey sell...@mips.com
Re: [patch] cilkplus array notation for C (clean, independent patchset, take 1)
On 03/22/13 17:03, Iyer, Balaji V wrote: I have not fixed all the issues below (the big one that is left is the builtin function representation that Joseph pointed out). I have fixed most of the other issues. All the things I have fixed are marked by FIXED! Don't worry, I can work on the builtin function representation. I am keeping a list of pending issues on the wiki (http://gcc.gnu.org/wiki/cilkplus-merge) with my name in parentheses for items I am working on. Particularly, I have added a sub-section for array notation items that have been pointed out in reviews but have not been completed. I suggest you keep this list up to date as well, so we don't lose track of what has been pointed out. diff --git a/gcc/c-family/ChangeLog.cilkplus b/gcc/c-family/ChangeLog.cilkplus index 6591fd1..10db29b 100644 --- a/gcc/c-family/ChangeLog.cilkplus +++ b/gcc/c-family/ChangeLog.cilkplus @@ -1,7 +1,11 @@ +2013-03-22 Balaji V. Iyer balaji.v.i...@intel.com + + * c-pretty-print.c (pp_c_expression): Added ARRAY_NOTATION_REF case. + 2013-03-20 Balaji V. Iyer balaji.v.i...@intel.com * c-common.c (c_define_builtins): When cilkplus is enabled, the You can combine changelog entries into one entry. This will make it easier when we merge into mainline. So basically, add the c-pretty-print.c entry to the entry below it. Non-static function declarations like this should not be inside a .c file. If these functions are used outside this file, there should be an associated header that declares them; include it in the .c file. If only used inside the .c file that defines them, make them static (and topologically sort static functions inside a source file so that forward static declarations are only needed for cases of recursion). +/* Mark the FNDECL as cold, meaning that the function specified by FNDECL is + not run as is. */ The cold attribute means unlikely to be executed rather than not run as is. 
Maybe not run as is is what's relevant here, but I'm not clear why this attribute would be useful for built-in functions at all - the documentation suggests it's only relevant when a user defines a function themselves, and affects the code generated for that function, so wouldn't be relevant at all for built-in functions. I see you fixed this. Since you are only fixing some of the items Joseph pointed out in this patch, please put FIXED below each item you did to aid in reviewing. +void +array_notation_init_builtins (void) Other built-in functions use various .def files (builtins.def and the files it includes) to avoid lots of repetitive code like this - can you integrate this with that mechanism? If you do so, then you should be able to avoid (or massively simplify) functions like: +/* Returns true if the function call specified in FUNC_NAME is + __sec_implicit_index. */ + +bool +is_sec_implicit_index_fn (tree func_name) because code can use the BUILT_IN_* enum values to test whether a particular function is in use - which is certainly cleaner than using strcmp against the function name. And here put FIXED if fixed, or Aldy is going to work on this or remove it altogether so it's not assumed that it was fixed by this patch since you're quoting it. +/* Returns the first and only argument for FN, which should be a + sec_implicit_index function. FN's location in the source file is is + indicated by LOCATION. */ + +int +extract_sec_implicit_index_arg (location_t location, tree fn) { + tree fn_arg; + HOST_WIDE_INT return_int = 0; + if (!fn) +return -1; Why the random check for a NULL argument? If a NULL argument is valid (meaning that it makes the code cleaner to allow such arguments rather than making sure the function isn't called with them), this should be documented in the comment above the function; otherwise, if such an argument isn't valid, there is no need to check for it. I always tend to check for a null pointer before I access the fields in the structure. 
In this case it is unnecessary. In some cases (e.g. find_rank) there is a good chance a null pointer will be passed into the function and we need to check that and reject those. I think what Joseph is suggesting is that if NULL is not valid, then the caller should check this. But if NULL is valid, then it should be documented in the function comment at the top. + if (TREE_CODE (fn) == CALL_EXPR) +{ + fn_arg = CALL_EXPR_ARG (fn, 0); + if (really_constant_p (fn_arg)) I don't think really_constant_p is what's wanted; http://software.intel.com/sites/default/files/m/4/e/7/3/1/40297- Intel_Cilk_plus_lang_spec_2.htm says The argument shall be an integer constant expression., and such expressions always appear in the C front end as INTEGER_CST. So you can just check for INTEGER_CST. What about C++? This function is shared by both C and C++. Same thing for C++, but... Now a subtlety here is that the function argument will have been folded by
Re: [patch, mips] Patch to control the use of integer madd/msub instructions
Steve Ellcey sell...@imgtec.com writes: On Sat, 2013-03-23 at 14:50 +, Richard Sandiford wrote: This is similar in spirit to -mbranch-likely. It'd be good for consistency if they were defined in a similar style. I think that means removing !TARGET_MIPS16 from ISA_HAS_MADD_MSUB and instead having: #define GENERATE_MADD_MSUB (TARGET_IMADD !TARGET_MIPS16) There would also be: #define PTF_AVOID_IMADD 0x2 which should be included in the 74k description, and a block similar to the MASK_BRANCHLIKELY one in mips_option_override. There needs to be documentation in invoke.texi. I can do it this way if you want, I was using -mllsc as my template for how to implement this. Do you think the -mllsc flag should be implemented in the same way as -mbranch-likely? -mllsc is a little different in that it can be used even when the ISA doesn't support it (thanks to kernel emulation). -mimadd isn't like that though: we only want to use MADD/MSUB if the ISA has it. So I think it makes sense to leave -mllsc as it is but do -mimadd in the same way as -mbranch-likely. Thanks, Richard
[patch, fortran] Use memcmp() for string comparison for constant-length kind=1 strings
Hello world, this patch uses memcmp() directly when comparing two kind=1 strings of equal and constant lengths. The test case modification depends on the previous patch at http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00996.html for setting the string lengths for substrings. Regression-tested. No extra test case because the original test cases have to be modified to avoid failure, and test the new feature. OK for trunk after committing the patch above? 2013-03-25 Thomas Koenig tkoe...@gcc.gnu.org * trans-expr.c (build_memcmp_call): New function. (gfc_build_compare_string): If the kind=1 strings to be compared have constant and equal lengths, use memcmp(). 2013-03-25 Thomas Koenig tkoe...@gcc.gnu.org * gfortran.dg/character_comparison_3.f90: Adjust for use of memcmp for constant and equal string lengths. * gfortran.dg/character_comparison_5.f90: Likewise. Index: fortran/trans-expr.c === --- fortran/trans-expr.c (revision 196748) +++ fortran/trans-expr.c (working copy) @@ -2655,6 +2665,32 @@ gfc_optimize_len_trim (tree len, tree str, int kin return -1; } +/* Helper to build a call to memcmp. */ + +static tree +build_memcmp_call (tree s1, tree s2, tree n) +{ + tree tmp; + + if (!POINTER_TYPE_P (TREE_TYPE (s1))) +s1 = gfc_build_addr_expr (pvoid_type_node, s1); + else +s1 = fold_convert (pvoid_type_node, s1); + + if (!POINTER_TYPE_P (TREE_TYPE (s2))) +s2 = gfc_build_addr_expr (pvoid_type_node, s2); + else +s2 = fold_convert (pvoid_type_node, s2); + + n = fold_convert (size_type_node, n); + + tmp = build_call_expr_loc (input_location, + builtin_decl_explicit (BUILT_IN_MEMCMP), + 3, s1, s2, n); + + return fold_convert (integer_type_node, tmp); +} + /* Compare two strings. If they are all single characters, the result is the subtraction of them. Otherwise, we build a library call. */ @@ -2698,7 +2734,13 @@ gfc_build_compare_string (tree len1, tree str1, tr /* Build a call for the comparison. 
*/ if (kind == 1) -fndecl = gfor_fndecl_compare_string; +{ + if (INTEGER_CST_P (len1) INTEGER_CST_P (len2) + tree_int_cst_equal (len1, len2)) + return build_memcmp_call (str1, str2, len1); + else + fndecl = gfor_fndecl_compare_string; +} else if (kind == 4) fndecl = gfor_fndecl_compare_string_char4; else Index: testsuite/gfortran.dg/character_comparison_3.f90 === --- testsuite/gfortran.dg/character_comparison_3.f90 (Revision 196748) +++ testsuite/gfortran.dg/character_comparison_3.f90 (Arbeitskopie) @@ -25,6 +25,7 @@ program main if (c(:k3) == c(:k44)) call abort end program main -! { dg-final { scan-tree-dump-times gfortran_compare_string 8 original } } +! { dg-final { scan-tree-dump-times gfortran_compare_string 6 original } } +! { dg-final { scan-tree-dump-times __builtin_memcmp 2 original } } ! { dg-final { cleanup-tree-dump original } } Index: testsuite/gfortran.dg/character_comparison_5.f90 === --- testsuite/gfortran.dg/character_comparison_5.f90 (Revision 196748) +++ testsuite/gfortran.dg/character_comparison_5.f90 (Arbeitskopie) @@ -16,6 +16,6 @@ program main end program main ! { dg-final { scan-tree-dump-times gfortran_concat_string 0 original } } -! { dg-final { scan-tree-dump-times gfortran_compare_string 2 original } } +! { dg-final { scan-tree-dump-times __builtin_memcmp 2 original } } ! { dg-final { cleanup-tree-dump original } }
Re: [patch] Unified debug dump function names.
On 3/25/13, Richard Biener richard.guent...@gmail.com wrote: You add a not used new interface. What for? So that people can use it. For use from gdb only? No, for use from both gdb and internally. It is often that folks add dumps in various places while developing/debugging. These functions support that effort without having to hunt down the name. In which case it should be debug (), not dump (). I will use whatever name you wish, but I would have preferred that we addressed naming issues when we published the plan, not after I've done the implementation. What name do you wish? -- Lawrence Crowl
Re: [patch] Unified debug dump function names.
On 3/25/13, Tom Tromey tro...@redhat.com wrote:
> Lawrence == Lawrence Crowl cr...@googlers.com writes:
>
> Lawrence> This patch is somewhat different from the original plan at
> Lawrence> gcc.gnu.org/wiki/cxx-conversion/debugging-dumps. The reason
> Lawrence> is that gdb has an incomplete implementation of C++ call syntax;
> Lawrence> requiring explicit specification of template arguments and
> Lawrence> explicit specification of function arguments even when they have
> Lawrence> default values.
>
> Note that the latter is because GCC doesn't emit this information.

I'm not laying blame anywhere, just informing folks of an adjustment to the plan due to the current situation.

> As for the former ... we have a patch that works in some cases, but it's actually unclear to me how well the debugger can do in general here. We haven't put it in since it seems better to require users to be explicit than to silently do the wrong thing in some cases.

My model is that I should be able to cut and paste an expression from the source to the debugger and have it work. I concede that C++ function overload resolution is a hard problem. However, gdb has a slightly easier task in that it won't be doing instantiation (as that expression has already instantiated everything it needs) and so it need only pick among what exists.

-- 
Lawrence Crowl
Re: extend fwprop optimization
On Mon, Mar 25, 2013 at 2:35 AM, Richard Biener richard.guent...@gmail.com wrote:
> On Sun, Mar 24, 2013 at 5:18 AM, Wei Mi w...@google.com wrote:
>> This is the patch to add the shift truncation in simplify_binary_operation_1. I add a new hook TARGET_SHIFT_COUNT_TRUNCATED which uses enum rtx_code to decide whether we can do shift truncation. I didn't use TARGET_SHIFT_TRUNCATION_MASK in simplify_binary_operation_1 because it uses the macro SHIFT_COUNT_TRUNCATED. If I change SHIFT_COUNT_TRUNCATED to targetm.shift_count_truncated in TARGET_SHIFT_TRUNCATION_MASK, I need to give TARGET_SHIFT_TRUNCATION_MASK an enum rtx_code param, which wasn't trivial to get at many places in existing code.
>>
>> patch.1 ~ patch.4 pass regression and bootstrap on x86_64-unknown-linux-gnu.
>
> Doing this might prove dangerous in case some pass may later decide to use an instruction that behaves in different ways. Consider
>
>   tem = 1 << (n & 255);   // count truncated
>   x = y & tem;            // bittest instruction bit nr _not_ truncated
>
> so if tem is expanded to use a shift instruction which truncates the shift count the explicit and is dropped. If later combine comes around and combines the bit-test to use the bittest instruction which does not implicitly truncate the count you have generated wrong-code.

So it means the existing truncation pattern defined in insn split is also incorrect because the truncated shift may be combined into a bit test pattern?

// The following define_insn_and_split will do shift truncation.
(define_insn_and_split "*<shift_insn><mode>3_mask"
  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm")
        (any_shiftrt:SWI48
          (match_operand:SWI48 1 "nonimmediate_operand" "0")
          (subreg:QI
            (and:SI
              (match_operand:SI 2 "nonimmediate_operand" "c")
              (match_operand:SI 3 "const_int_operand" "n")) 0)))
   (clobber (reg:CC FLAGS_REG))]
  "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)
   && (INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1))
      == GET_MODE_BITSIZE (<MODE>mode)-1"
  "#"
  "&& 1"
  [(parallel [(set (match_dup 0)
                   (any_shiftrt:SWI48 (match_dup 1) (match_dup 2)))
              (clobber (reg:CC FLAGS_REG))])]
{
  if (can_create_pseudo_p ())
    operands [2] = force_reg (SImode, operands[2]);

  operands[2] = simplify_gen_subreg (QImode, operands[2], SImode, 0);
}
  [(set_attr "type" "ishift")
   (set_attr "mode" "<MODE>")])

> So we need to make sure any explicit truncation originally in place is kept in the RTL - which means SHIFT_COUNT_TRUNCATED should not exist at all, but instead there would be two patterns for shifts with implicit truncation - one involving the truncation (canonicalized to bitwise and) and one not involving the truncation.
>
> Richard.

I am trying to figure out a way not to lose the opportunity when shift truncation is not combined into a bit test pattern. Can we keep the explicit truncation in RTL, but generate truncation code in assembly? Then only shift truncation which is not combined into a bit test pattern will happen.

(define_insn "*<shift_insn>_and<mode>"
  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm")
        (any_shiftrt:SWI48
          (match_operand:SWI48 1 "nonimmediate_operand" "0")
          (subreg:QI
            (and:SI
              (match_operand:SI 2 "nonimmediate_operand" "c")
              (match_operand:SI 3 "const_int_operand" "n")) 0)))
   (clobber (reg:CC FLAGS_REG))]
  "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
{
  if ((INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1))
      == GET_MODE_BITSIZE (<MODE>mode)-1)
    return "and\t{%3, %2|%2, %3}\n\r shift\t{%b2, %0|%0, %b2}";
  else
    "shift\t{%2, %0|%0, %2}";
}

Thanks,
Wei.
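Editorial note: the hazard Richard describes can be modelled in plain C. The sketch below is a simplified illustration, not GCC code. The function names are hypothetical, a 32-bit word size is assumed, and `bit_test_untruncated` only loosely models a bit-test instruction that indexes past the first word instead of masking the bit number.

```c
#include <stdint.h>

/* Model of a shift instruction that implicitly truncates its count to
   the low 5 bits.  */
static uint32_t
shl_truncating (uint32_t value, uint32_t count)
{
  return value << (count & 31);
}

/* Model of an instruction that does NOT truncate: bit number BIT is
   looked up in an array of 32-bit words, so bit 37 means word 1,
   bit 5.  (Assumed behavior, for illustration only.)  */
static int
bit_test_untruncated (const uint32_t *words, uint32_t bit)
{
  return (words[bit / 32] >> (bit % 32)) & 1;
}

/* What the source asked for: explicit truncation of the count,
   followed by a mask test on the first word.  */
static int
test_with_explicit_mask (const uint32_t *words, uint32_t n)
{
  uint32_t tem = shl_truncating (1, n & 31);
  return (words[0] & tem) != 0;
}
```

For `n == 37` and `words = { 0x20, 0 }` the two test functions disagree. That disagreement is the wrong-code case: once the explicit `and` has been dropped from the RTL, a later pass can no longer tell that the bit number was meant to be truncated.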
Re: extend fwprop optimization
> I am trying to figure out a way not to lose the opportunity when shift truncation is not combined into a bit test pattern. Can we keep the explicit truncation in RTL, but generate truncation code in assembly? Then only shift truncation which is not combined into a bit test pattern will happen.
>
> (define_insn "*<shift_insn>_and<mode>"
>   [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm")
>         (any_shiftrt:SWI48
>           (match_operand:SWI48 1 "nonimmediate_operand" "0")
>           (subreg:QI
>             (and:SI
>               (match_operand:SI 2 "nonimmediate_operand" "c")
>               (match_operand:SI 3 "const_int_operand" "n")) 0)))
>    (clobber (reg:CC FLAGS_REG))]
>   "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
> {
>   if ((INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1))
>       == GET_MODE_BITSIZE (<MODE>mode)-1)
>     return "and\t{%3, %2|%2, %3}\n\r shift\t{%b2, %0|%0, %b2}";
>   else
>     "shift\t{%2, %0|%0, %2}";
> }

Sorry, rectify a mistake:

{
  if ((INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1))
      == GET_MODE_BITSIZE (<MODE>mode)-1)
    return "shift\t{%2, %0|%0, %2}";
  else
    return "and\t{%3, %2|%2, %3}\n\r shift\t{%b2, %0|%0, %b2}";
}

Thanks,
Wei.
Re: [ARM] Fix ICE in minipool handling at -Os
On 23/03/13 11:20, Eric Botcazou wrote:
> We ran into an ICE at -Os on the 4.7 branch for ARM (BE/VFPv3/ARM):
>
> FAIL: gcc.c-torture/compile/920928-2.c -Os (internal compiler error)
>
> It's an assertion deep in the ARM back-end:
>
>   /* If an insn doesn't have a range defined for it, then it isn't
>      expecting to be reworked by this code.  Better to stop now than
>      to generate duff assembly code.  */
>   gcc_assert (fix->forwards || fix->backwards);
>
> This happens for arm_zero_extendhisi2_v6, but I fail to see what is different for it from arm_extendhisi2_v6, which is expecting to be reworked. Hence the attached patch, which copies attributes from arm_extendhisi2_v6 to arm_zero_extendhisi2_v6.
>
> No regressions on ARM, OK for the mainline?
>
> 2013-03-23 Eric Botcazou ebotca...@adacore.com
>
> 	* config/arm/arm.md (arm_zero_extendhisi2): Add pool_range and neg_pool_range attributes.
> 	(arm_zero_extendhisi2_v6): Likewise.

Having half-word accesses into the minipool is generally a bad idea. The limited offset range that's supported by these instructions means it's much more likely that we'll end up with a pool after a conditional branch or, worse, in the middle of a linear code sequence. That means we have to jump around the pool, which costs performance.

We really need to find out why the compiler keeps trying to create these and fix that problem rather than work around the issue. I'm not sure why the v6 variants allow this; I thought that had been taken out.

R.
p.diff Index: config/arm/arm.md === --- config/arm/arm.md (revision 196816) +++ config/arm/arm.md (working copy) @@ -4650,7 +4650,9 @@ (define_insn *arm_zero_extendhisi2 # ldr%(h%)\\t%0, %1 [(set_attr type alu_shift,load_byte) - (set_attr predicable yes)] + (set_attr predicable yes) + (set_attr pool_range *,256) + (set_attr neg_pool_range *,244)] ) (define_insn *arm_zero_extendhisi2_v6 @@ -4660,8 +4662,10 @@ (define_insn *arm_zero_extendhisi2_v6 @ uxth%?\\t%0, %1 ldr%(h%)\\t%0, %1 - [(set_attr predicable yes) - (set_attr type simple_alu_shift,load_byte)] + [(set_attr type simple_alu_shift,load_byte) + (set_attr predicable yes) + (set_attr pool_range *,256) + (set_attr neg_pool_range *,244)] ) (define_insn *arm_zero_extendhisi2addsi
Re: [patch] Unified debug dump function names.
Lawrence == Lawrence Crowl cr...@googlers.com writes:

Lawrence> My model is that I should be able to cut and paste an expression
Lawrence> from the source to the debugger and have it work. I concede that
Lawrence> C++ function overload resolution is a hard problem. However, gdb
Lawrence> has a slightly easier task in that it won't be doing instantiation
Lawrence> (as that expression has already instantiated everything it needs)
Lawrence> and so it need only pick among what exists.

Yeah, what isn't clear to me is that even this can be done in a behavior-preserving way, at least short of having full source available and the entire compiler in the debugger. I'd be very pleased to be wrong, but my current understanding is that one can play arbitrary games with SFINAE to come up with code that defeats any less complete solution.

Sergio is going to look at this area again. So if you know differently, it would be great to have your input.

I can dig up the current (pending -- but really unreviewed for a few years for the above reasons) gdb patch if you are interested. I believe it worked by applying overload-resolution-like rules to templates (though it has been a while).

Tom
[PATCH 6/n, i386]: Merge *zero_extendsidi2_rex64 with base pattern using x64 and nox64 isa attribute
Hello! 2013-03-25 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (*zero_extendsidi2): Merge with *zero_extendsidi2_rex64. Use x64 and nox64 isa attributes. * config/i386/predicates.md (x86_64_zext_operand): Rename from x86_64_zext_general_operand. Use nonimmediate_operand on 32bit targets. Clarify comment. Tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN. Uros. Index: i386.md === --- i386.md (revision 197053) +++ i386.md (working copy) @@ -3135,13 +3135,13 @@ [(set (match_operand:DI 0 nonimmediate_operand) (zero_extend:DI (match_operand:SI 1 nonimmediate_operand)))]) -(define_insn *zero_extendsidi2_rex64 +(define_insn *zero_extendsidi2 [(set (match_operand:DI 0 nonimmediate_operand - =r ,o,?*Ym,?*y,?*Yi,?*x) + =r,?r,?o,r ,o,?*Ym,?*y,?*Yi,?*x) (zero_extend:DI -(match_operand:SI 1 x86_64_zext_general_operand - rmWz,0,r ,m ,r ,m)))] - TARGET_64BIT +(match_operand:SI 1 x86_64_zext_operand + 0 ,rm,r ,rmWz,0,r ,m ,r ,m)))] + { switch (get_attr_type (insn)) { @@ -3164,30 +3164,40 @@ gcc_unreachable (); } } - [(set_attr type imovx,multi,mmxmov,mmxmov,ssemov,ssemov) - (set_attr prefix orig,*,orig,orig,maybe_vex,maybe_vex) - (set_attr prefix_0f 0,*,*,*,*,*) - (set_attr mode SI,SI,DI,DI,TI,TI)]) + [(set (attr isa) + (cond [(eq_attr alternative 0,1,2) + (const_string nox64) + (eq_attr alternative 3) + (const_string x64) + (eq_attr alternative 8) + (const_string sse2) + ] + (const_string *))) + (set (attr type) + (cond [(eq_attr alternative 0,1,2,4) + (const_string multi) + (eq_attr alternative 5,6) + (const_string mmxmov) + (eq_attr alternative 7,8) + (const_string ssemov) + ] + (const_string imovx))) + (set (attr prefix) + (if_then_else (eq_attr type ssemov) + (const_string maybe_vex) + (const_string orig))) + (set (attr prefix_0f) + (if_then_else (eq_attr type imovx) + (const_string 0) + (const_string *))) + (set (attr mode) + (cond [(eq_attr alternative 5,6) + (const_string DI) + (eq_attr alternative 7,8) + (const_string TI) + ] + (const_string 
SI)))]) -(define_insn *zero_extendsidi2 - [(set (match_operand:DI 0 nonimmediate_operand - =ro,?r,?o,?*Ym,?*y,?*Yi,?*x) - (zero_extend:DI (match_operand:SI 1 nonimmediate_operand - 0 ,rm,r ,r ,m ,r ,m)))] - !TARGET_64BIT - @ - # - # - # - movd\t{%1, %0|%0, %1} - movd\t{%1, %0|%0, %1} - %vmovd\t{%1, %0|%0, %1} - %vmovd\t{%1, %0|%0, %1} - [(set_attr isa *,*,*,*,*,*,sse2) - (set_attr type multi,multi,multi,mmxmov,mmxmov,ssemov,ssemov) - (set_attr prefix *,*,*,orig,orig,maybe_vex,maybe_vex) - (set_attr mode SI,SI,SI,DI,DI,TI,TI)]) - (define_split [(set (match_operand:DI 0 memory_operand) (zero_extend:DI (match_operand:SI 1 memory_operand)))] Index: predicates.md === --- predicates.md (revision 197053) +++ predicates.md (working copy) @@ -311,15 +311,15 @@ (match_operand 0 x86_64_immediate_operand)) (match_operand 0 general_operand))) -;; Return true if OP is general operand representable on x86_64 -;; as zero extended constant. This predicate is used in zero-extending -;; conversion operations that require non-VOIDmode immediate operands. -(define_predicate x86_64_zext_general_operand +;; Return true if OP is representable on x86_64 as zero-extended operand. +;; This predicate is used in zero-extending conversion operations that +;; require non-VOIDmode immediate operands. +(define_predicate x86_64_zext_operand (if_then_else (match_test TARGET_64BIT) (ior (match_operand 0 nonimmediate_operand) (and (match_operand 0 x86_64_zext_immediate_operand) (match_test GET_MODE (op) != VOIDmode))) -(match_operand 0 general_operand))) +(match_operand 0 nonimmediate_operand))) ;; Return true if OP is general operand representable on x86_64 ;; as either sign extended or zero extended constant.
Re: GCC 4.8.0 does not compile for DJGPP
On Mon, Mar 25, 2013 at 11:02 AM, Fabrizio Gennari fabrizio...@tiscali.it wrote: Il 25/03/2013 00:00, Ian Lance Taylor ha scritto: On Sun, Mar 24, 2013 at 10:51 AM, Fabrizio Gennari fabrizio...@tiscali.it wrote: Il 24/03/2013 18:48, Fabrizio Gennari ha scritto: Il 23/03/2013 18:07, DJ Delorie ha scritto: The DJGPP build of gcc 4.8.0 was just uploaded, it might have some patches that haven't been committed upstream yet. Thank you DJ. I downloaded beta/v2gnu/gcc480s.zip from a mirror, and that compiles. And, indeed, the file gcc/config/i386/djgpp.h is different from the one in the official gcc-4.8.0.tar.bz2, meaning that some DJGPP patches are not present upstream. Forgot to say that I also had to apply this patch --- ../gcc-4.8.0/libbacktrace/alloc.c2013-01-14 19:17:30.0 +0100 +++ ../gcc-4.80/libbacktrace/alloc.c2013-03-24 18:07:11.995891959 +0100 @@ -34,6 +34,7 @@ #include errno.h #include stdlib.h +#include sys/types.h #include backtrace.h #include internal.h What failed without that patch? Ian libtool: compile: /home/fabrizio/dev/djgpp/cross/gcc2/./gcc/xgcc -B/home/fabrizio/dev/djgpp/cross/gcc2/./gcc/ -B/home/fabrizio/dev/djgpp/i586-pc-msdosdjgpp/bin/ -B/home/fabrizio/dev/djgpp/i586-pc-msdosdjgpp/lib/ -isystem /home/fabrizio/dev/djgpp/i586-pc-msdosdjgpp/include -isystem /home/fabrizio/dev/djgpp/i586-pc-msdosdjgpp/sys-include -DHAVE_CONFIG_H -I. 
-I../../../gcc-4.80/libbacktrace -I ../../../gcc-4.80/libbacktrace/../include -I ../../../gcc-4.80/libbacktrace/../libgcc -I ../libgcc -funwind-tables -frandom-seed=alloc.lo -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -Werror -fPIC -g -O2 -c ../../../gcc-4.80/libbacktrace/alloc.c -o alloc.o -fPIC ignored (not supported for DJGPP) In file included from ../../../gcc-4.80/libbacktrace/alloc.c:39:0: ../../../gcc-4.80/libbacktrace/internal.h:141:11: error: unknown type name ‘off_t’ off_t offset, size_t size, ^ make[3]: *** [alloc.lo] Errore 1 make[3]: uscita dalla directory /home/fabrizio/dev/djgpp/cross/gcc2/i586-pc-msdosdjgpp/libbacktrace internal.h (included by libbacktrace/alloc.c) uses off_t, which is not declared unless sys/types.h is included Thanks. I committed the following patch to mainline and 4.8 branch. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ian 2013-03-25 Ian Lance Taylor i...@google.com * alloc.c: #include sys/types.h. * mmap.c: Likewise. foo.patch Description: Binary data
Re: [testsuite] Cap VLEN in gcc.c-torture/execute/20011008-3.c
On Mar 25, 2013, at 4:27 AM, Eric Botcazou ebotca...@adacore.com wrote:
> gcc.c-torture/execute/20011008-3.c has these lines:
>
> #ifndef STACK_SIZE
> #define VLEN 1235
> #else
> #define VLEN (STACK_SIZE/10)
> #endif
>
> which means that VLEN is _not_ capped if STACK_SIZE is defined, which goes against the very purpose of STACK_SIZE in the testing framework. Fixed thusly, tested on x86_64-suse-linux, OK for the mainline?

Ok.

> 2013-03-25 Eric Botcazou ebotca...@adacore.com
>
> 	* gcc.c-torture/execute/20011008-3.c: Cap VLEN with STACK_SIZE too.
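Editorial note: the committed change is not shown in this message, so the exact form of the fix is unknown. The sketch below is one plausible way to cap the value, under the assumption that VLEN should never exceed the default of 1235. `DEFAULT_VLEN` and `capped_vlen` are hypothetical names introduced for this note.

```c
/* Default vector length used when no stack limit is given
   (hypothetical name for this sketch).  */
#define DEFAULT_VLEN 1235

/* Use STACK_SIZE/10 when STACK_SIZE is defined and that value is
   smaller than the default; otherwise fall back to the default.  */
#if defined (STACK_SIZE) && (STACK_SIZE / 10) < DEFAULT_VLEN
#define VLEN (STACK_SIZE / 10)
#else
#define VLEN DEFAULT_VLEN
#endif

/* Run-time helper that mirrors the preprocessor logic, so the capping
   rule can be checked for several stack sizes in one build.  */
static int
capped_vlen (long stack_size)
{
  long limit = stack_size / 10;
  return limit < DEFAULT_VLEN ? (int) limit : DEFAULT_VLEN;
}
```

With this shape a very large STACK_SIZE no longer inflates VLEN, while a small STACK_SIZE still shrinks it.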
[PATCH] Fix -Wformat-security warning in arm.c
This fixes a gratuitous warning.

Thanks,
Roland

gcc/
2013-03-25 Roland McGrath mcgra...@google.com

	* config/arm/arm.c (arm_print_operand: case 'w'): Use fputs rather than fprintf with a non-constant, non-format string.

--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17997,7 +17997,7 @@ arm_print_operand (FILE *stream, rtx x, int code)
 	    "wC12", "wC13", "wC14", "wC15"
 	  };
 
-	  fprintf (stream, wc_reg_names [INTVAL (x)]);
+	  fputs (wc_reg_names [INTVAL (x)], stream);
 	}
       return;
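Editorial note on why the warning fires: passing a non-literal string as the format argument of `fprintf` with no further arguments is what -Wformat-security flags, because any `%` in that string would be interpreted as a conversion. `fputs` writes the string verbatim. A minimal sketch follows; the table and the function name are hypothetical stand-ins, not the contents of arm.c.

```c
#include <stdio.h>

/* Hypothetical register-name table for this example only.  */
static const char *const reg_names[] = { "wC12", "wC13", "wC14", "wC15" };

/* Write the name of register INDEX to STREAM.  fputs copies the string
   as-is, so a '%' in a name would never be treated as a format
   directive.  Returns a non-negative value on success.  */
static int
print_reg_name (FILE *stream, unsigned index)
{
  return fputs (reg_names[index], stream);
}
```

An alternative that keeps `fprintf` is `fprintf (stream, "%s", name)`, which also silences the warning; `fputs` is simply the shorter form.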
[C++ Patch] PR 56722
Hi, avoid a Seg fault on invalid by preliminarily checking DECL_LANG_SPECIFIC. The resulting error message is very similar to clang's. Tested x86_64-linux. Ok mainline and branch? Thanks, Paolo. /cp 2013-03-25 Paolo Carlini paolo.carl...@oracle.com PR c++/56722 * decl.c (cp_finish_decl): Check DECL_LANG_SPECIFIC before DECL_TEMPLATE_INSTANTIATION. /testsuite 2013-03-25 Paolo Carlini paolo.carl...@oracle.com PR c++/56722 * g++.dg/cpp0x/range-for23.C: New. Index: cp/decl.c === --- cp/decl.c (revision 197053) +++ cp/decl.c (working copy) @@ -6111,7 +6111,8 @@ cp_finish_decl (tree decl, tree init, bool init_co tree d_init; if (init == NULL_TREE) { - if (DECL_TEMPLATE_INSTANTIATION (decl) + if (DECL_LANG_SPECIFIC (decl) + DECL_TEMPLATE_INSTANTIATION (decl) !DECL_TEMPLATE_INSTANTIATED (decl)) { /* init is null because we're deferring instantiating the Index: testsuite/g++.dg/cpp0x/range-for23.C === --- testsuite/g++.dg/cpp0x/range-for23.C(revision 0) +++ testsuite/g++.dg/cpp0x/range-for23.C(working copy) @@ -0,0 +1,8 @@ +// PR c++/56722 +// { dg-do compile { target c++11 } } + +int main() +{ + for (const auto i, 21) // { dg-error has no initializer|expected } +i; +}
Re: Record missing equivalence
On 03/21/2013 03:44 AM, Richard Biener wrote: + /* If LHS is an SSA_NAME and RHS is a constant and LHS was set +via a widening type conversion, then we may be able to record +additional equivalences. */ + if (lhs + TREE_CODE (lhs) == SSA_NAME + is_gimple_constant (rhs)) + { + gimple defstmt = SSA_NAME_DEF_STMT (lhs); + + if (defstmt + is_gimple_assign (defstmt) + CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (defstmt))) + { + tree old_rhs = gimple_assign_rhs1 (defstmt); + tree newval = fold_convert (TREE_TYPE (old_rhs), rhs); You want to delay that folding and creating of a new tree node until after ... + + /* If this was a widening conversion and if RHS is converted +to the type of OLD_RHS and has the same value, then we +can record an equivalence between OLD_RHS and the +converted representation of RHS. */ + if ((TYPE_PRECISION (TREE_TYPE (lhs)) + TYPE_PRECISION (TREE_TYPE (old_rhs))) ... this check. + operand_equal_p (rhs, newval, 0)) If you'd restricted yourself to handling INTEGER_CSTs then using int_fits_type_p (rhs, TREE_TYPE (lhs)) would have been enough to check. And operand_equal_p will never return for non-equal typed non-INTEGER_CSTs anyway ... Agreed. Addressed via the attached patch which was committed after a bootstrap and regression test on x86_64-unknown-linux-gnu. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 9bdf1e5..9db0629 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2013-03-25 Jeff Law l...@redhat.com + + * tree-ssa-dom.c (record_equivalences_from_incoming_edge): Rework + slightly to avoid creating and folding useless trees. Simplify + slightly by restricting to INTEGER_CSTs and using int_fits_type_p. 
+ 2013-03-25 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (*zero_extendsidi2): Merge with diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index 57b814c..a71c6dc 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -1135,12 +1135,13 @@ record_equivalences_from_incoming_edge (basic_block bb) if (lhs) record_equality (lhs, rhs); - /* If LHS is an SSA_NAME and RHS is a constant and LHS was set -via a widening type conversion, then we may be able to record + /* If LHS is an SSA_NAME and RHS is a constant integer and LHS was +set via a widening type conversion, then we may be able to record additional equivalences. */ if (lhs TREE_CODE (lhs) == SSA_NAME - is_gimple_constant (rhs)) + is_gimple_constant (rhs) + TREE_CODE (rhs) == INTEGER_CST) { gimple defstmt = SSA_NAME_DEF_STMT (lhs); @@ -1149,16 +1150,14 @@ record_equivalences_from_incoming_edge (basic_block bb) CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (defstmt))) { tree old_rhs = gimple_assign_rhs1 (defstmt); - tree newval = fold_convert (TREE_TYPE (old_rhs), rhs); - - /* If this was a widening conversion and if RHS is converted -to the type of OLD_RHS and has the same value, then we -can record an equivalence between OLD_RHS and the -converted representation of RHS. */ - if ((TYPE_PRECISION (TREE_TYPE (lhs)) - TYPE_PRECISION (TREE_TYPE (old_rhs))) - operand_equal_p (rhs, newval, 0)) - record_equality (old_rhs, newval); + + /* If the constant is in the range of the type of OLD_RHS, +then convert the constant and record the equivalence. */ + if (int_fits_type_p (rhs, TREE_TYPE (old_rhs))) + { + tree newval = fold_convert (TREE_TYPE (old_rhs), rhs); + record_equality (old_rhs, newval); + } } }
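Editorial note: the reworked check can be pictured with ordinary C types. If a wide value is known to be the widened copy of a narrow value, and the wide value is known to equal a constant, then the narrow value equals that constant only when the constant fits in the narrow type. The sketch below uses `unsigned char` as the narrow type; both function names are hypothetical analogues, not GCC internals.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical analogue of a "fits in type" test for a constant C
   against the narrower type unsigned char.  */
static bool
fits_in_uchar (long c)
{
  return c >= 0 && c <= UCHAR_MAX;
}

/* Given wide == (long) narrow and the known fact wide == C, decide
   whether narrow == (unsigned char) C may be recorded as well.  The
   conversion is performed only after the range check succeeds, which
   mirrors the "delay the folding" point from the review.  */
static bool
can_record_narrow_equivalence (long c, unsigned char *narrow_const)
{
  if (!fits_in_uchar (c))
    return false;
  *narrow_const = (unsigned char) c;
  return true;
}
```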
Re: [C++ Patch] PR 56722
OK. Jason
C++ PATCH for c++/52014 (decltype and members of the enclosing class of a lambda)
We were getting confused trying to capture this for this-foo_ in the decltype. But we shouldn't capture anything just because it's mentioned in decltype. Tested x86_64-pc-linux-gnu, applying to trunk and 4.8. commit 580a948e95a571add5ea02b18f996cac227dfa77 Author: Jason Merrill ja...@redhat.com Date: Sun Mar 24 06:15:22 2013 -0400 PR c++/52014 * semantics.c (lambda_expr_this_capture): Don't capture 'this' in unevaluated context. diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index e3aeb81..fb38e8d 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -9454,6 +9454,11 @@ lambda_expr_this_capture (tree lambda) tree this_capture = LAMBDA_EXPR_THIS_CAPTURE (lambda); + /* In unevaluated context this isn't an odr-use, so just return the + nearest 'this'. */ + if (cp_unevaluated_operand) +return lookup_name (this_identifier); + /* Try to default capture 'this' if we can. */ if (!this_capture LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lambda) != CPLD_NONE) @@ -9523,11 +9528,6 @@ lambda_expr_this_capture (tree lambda) if (!this_capture) { - /* In unevaluated context this isn't an odr-use, so just return the - nearest 'this'. 
*/ - if (cp_unevaluated_operand) - return lookup_name (this_identifier); - error (%this% was not captured for this lambda function); result = error_mark_node; } diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this14.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this14.C new file mode 100644 index 000..9834bfd --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this14.C @@ -0,0 +1,49 @@ +// PR c++/52014 +// { dg-require-effective-target c++11 } + +template class Iterator, class Func +void for_each(const Iterator first, const Iterator last, Func func) +{ + for (Iterator it = first; it != last; ++it) { +func(*it); + } +} + +template class T +struct helper +{ + typedef typename T::size_type type; +}; + +template class T +struct helperT +{ + typedef typename T::size_type type; +}; + +template class T +struct helperT* +{ + typedef typename T::size_type type; +}; + +struct bar +{ + struct foo + { +typedef int size_type; + } foo_; + + void test() + { +int arr[] = { 1, 2, 3 }; +for_each(arr, arr + 3, [](helperfoo::type i) { + for_each(arr, arr + 3, [](helperdecltype(foo_)::type j) { }); + }); + } +}; + +int main() +{ + return 0; +}
Re: [patch, mips] Patch to control the use of integer madd/msub instructions
On Mon, 2013-03-25 at 16:45 +, Richard Sandiford wrote: -mllsc is a little different in that it can be used even when the ISA doesn't support it (thanks to kernel emulation). -mimadd isn't like that though: we only want to use MADD/MSUB if the ISA has it. So I think it makes sense to leave -mllsc as it is but do -mimadd in the same way as -mbranch-likely. Thanks, Richard OK, Here is a patch that implements -mimadd in the same manner as -mbranch-likely. Steve Ellcey sell...@imgtec.com 2013-03-25 Steve Ellcey sell...@mips.com * config/mips/mmips-cpus.def (74kc, 74kf2_1, 74kf, 74kf, 74kf1_1, 74kfx, 74kx, 74kf3_2): Add PTF_AVOID_IMADD. * config/mips/mips.c (mips_option_override): Set IMADD default. * config/mips/mips.h (PTF_AVOID_IMADD): New. (ISA_HAS_MADD_MSUB): Remove MIPS16 check. (GENERATE_MADD_MSUB): Remove TUNE_74K check, add MIPS16 check. * config/mips/mips.md (mimadd): New flag for integer madd/msub. diff --git a/gcc/config/mips/mips-cpus.def b/gcc/config/mips/mips-cpus.def index 93c305a..c920c73 100644 --- a/gcc/config/mips/mips-cpus.def +++ b/gcc/config/mips/mips-cpus.def @@ -119,13 +119,13 @@ MIPS_CPU (34kfx, PROCESSOR_24KF1_1, 33, 0) MIPS_CPU (34kx, PROCESSOR_24KF1_1, 33, 0) MIPS_CPU (34kn, PROCESSOR_24KC, 33, 0) /* 34K with MT but no DSP. */ -MIPS_CPU (74kc, PROCESSOR_74KC, 33, 0) /* 74K with DSPr2. */ -MIPS_CPU (74kf2_1, PROCESSOR_74KF2_1, 33, 0) -MIPS_CPU (74kf, PROCESSOR_74KF2_1, 33, 0) -MIPS_CPU (74kf1_1, PROCESSOR_74KF1_1, 33, 0) -MIPS_CPU (74kfx, PROCESSOR_74KF1_1, 33, 0) -MIPS_CPU (74kx, PROCESSOR_74KF1_1, 33, 0) -MIPS_CPU (74kf3_2, PROCESSOR_74KF3_2, 33, 0) +MIPS_CPU (74kc, PROCESSOR_74KC, 33, PTF_AVOID_IMADD) /* 74K with DSPr2. 
*/ +MIPS_CPU (74kf2_1, PROCESSOR_74KF2_1, 33, PTF_AVOID_IMADD) +MIPS_CPU (74kf, PROCESSOR_74KF2_1, 33, PTF_AVOID_IMADD) +MIPS_CPU (74kf1_1, PROCESSOR_74KF1_1, 33, PTF_AVOID_IMADD) +MIPS_CPU (74kfx, PROCESSOR_74KF1_1, 33, PTF_AVOID_IMADD) +MIPS_CPU (74kx, PROCESSOR_74KF1_1, 33, PTF_AVOID_IMADD) +MIPS_CPU (74kf3_2, PROCESSOR_74KF3_2, 33, PTF_AVOID_IMADD) MIPS_CPU (1004kc, PROCESSOR_24KC, 33, 0) /* 1004K with MT/DSP. */ MIPS_CPU (1004kf2_1, PROCESSOR_24KF2_1, 33, 0) diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 252e828..0aaf4c6 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -16607,6 +16607,21 @@ mips_option_override (void) warning (0, the %qs architecture does not support branch-likely instructions, mips_arch_info-name); + /* If the user hasn't specified -mimadd or -mno-imadd set + MASK_IMADD based on the target architecture and tuning + flags. */ + if ((target_flags_explicit MASK_IMADD) == 0) +{ + if (ISA_HAS_MADD_MSUB + (mips_tune_info-tune_flags PTF_AVOID_IMADD) == 0) + target_flags |= MASK_IMADD; + else + target_flags = ~MASK_IMADD; +} + else if (TARGET_IMADD !ISA_HAS_MADD_MSUB) +warning (0, the %qs architecture does not support madd or msub + instructions, mips_arch_info-name); + /* The effect of -mabicalls isn't defined for the EABI. */ if (mips_abi == ABI_EABI TARGET_ABICALLS) { diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index 0acce14..534ea26 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -42,13 +42,17 @@ extern int target_flags_explicit; #define ABI_EABI 3 #define ABI_O64 4 -/* Masks that affect tuning. - - PTF_AVOID_BRANCHLIKELY - Set if it is usually not profitable to use branch-likely instructions - for this target, typically because the branches are always predicted - taken and so incur a large overhead when not taken. */ -#define PTF_AVOID_BRANCHLIKELY 0x1 +/* Masks that affect tuning. 
*/ + +/* Set PTF_AVOID_BRANCHLIKELY if is usually not profitable to use + branch-likely instructions for this target, typically because + the branches are always predicted taken and so incur a large + overhead when not taken. */ +#define PTF_AVOID_BRANCHLIKELY 0x1 +/* Set PTF_AVOID_IMADD if it is usually not profitable to use the + integer madd or msub instructions because of the overhead of + getting the result out of the HI/LO registers. */ +#define PTF_AVOID_IMADD0x2 /* Information about one recognized processor. Defined here for the benefit of TARGET_CPU_CPP_BUILTINS. */ @@ -868,14 +872,13 @@ struct mips_cpu_info { !TARGET_MIPS16) /* ISA has integer multiply-accumulate instructions, madd and msub. */ -#define ISA_HAS_MADD_MSUB ((ISA_MIPS32\ - || ISA_MIPS32R2 \ - || ISA_MIPS64 \ - || ISA_MIPS64R2) \ - !TARGET_MIPS16) +#define ISA_HAS_MADD_MSUB (ISA_MIPS32
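Editorial note: the defaulting rule added to mips_option_override in the patch above can be read as a small pure function. The sketch below restates it outside of GCC. `choose_imadd` is a hypothetical name, and the value given to `MASK_IMADD` here is an assumption made so that the example compiles on its own.

```c
#include <stdbool.h>

#define PTF_AVOID_BRANCHLIKELY 0x1
#define PTF_AVOID_IMADD        0x2

/* Assumed value for illustration; in GCC this mask is generated from
   the option definitions.  */
#define MASK_IMADD 0x1

/* Return TARGET_FLAGS with the IMADD bit set or cleared.  An explicit
   -mimadd / -mno-imadd (recorded in EXPLICIT_FLAGS) is left alone;
   otherwise the bit is enabled only if the ISA has MADD/MSUB and the
   tuning flags do not ask to avoid them.  */
static unsigned
choose_imadd (unsigned target_flags, unsigned explicit_flags,
              bool isa_has_madd_msub, unsigned tune_flags)
{
  if ((explicit_flags & MASK_IMADD) == 0)
    {
      if (isa_has_madd_msub && (tune_flags & PTF_AVOID_IMADD) == 0)
        target_flags |= MASK_IMADD;
      else
        target_flags &= ~MASK_IMADD;
    }
  return target_flags;
}
```

The warning branch for an explicit -mimadd on an ISA without MADD/MSUB is left out of the sketch, since it only reports a diagnostic and does not change the flags.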
Re: Small C++ PATCH to lookup_base
On 03/16/2013 09:36 PM, Jason Merrill wrote: This function ought to handle null T. I didn't think any of my other patches required this, but apparently I was wrong; this fixes 56692, so I'm applying it to 4.8 as well.
Re: [google][4.7]Using CPU mocks to test code coverage of multiversioned functions
Hi,

On Mon, Mar 18, 2013 at 10:44 PM, Alan Modra amo...@gmail.com wrote:
> On Mon, Mar 18, 2013 at 06:18:58PM +0100, Richard Biener wrote:
>> I was asking for the ifunc selector to be Overridable by ld_preload or a similar mechanism at dynamic load time.
>
> Please don't. Calling an ifunc resolver function in another library is just asking for trouble with current glibc. Why? Well, the other library containing the resolver function may not have had any dynamic relocations applied. So if the resolver makes use of the GOT (to read some variable), it will use unrelocated addresses. You'll segfault if you're lucky.

Does this also mean that Paul's idea of doing:

LD_CPU_FEATURES=sse,sse2 ./a.out # run as if only sse and sse2 are available

is fraught with risk when used with IFUNC, particularly on x86_64? Shouldn't the IFUNC resolver go through the GOT even in this case?

This could work well for the MV testing problem I explained earlier, but if this is not feasible with IFUNC in play I would like my original proposal reconsidered.

Thanks
Sri

> For anyone playing with ifunc, please test out your great ideas on i386, ppc32, mips, arm, etc. *NOT* x86_64 or powerpc64 which both avoid the GOT in many cases.
>
> --
> Alan Modra
> Australia Development Lab, IBM
Re: [patch] cilkplus array notation for C (clean, independent patchset, take 1)
The specification doesn't seem very clear on to what extent the __sec_* operations must act like functions (what happens if someone puts parentheses around the __sec_* name, for example - that wouldn't work with the keyword approach). So the specification should be clarified there, but I think saying the __sec_* operations are syntactically special, like keywords, is more appropriate than requiring other uses to work. + return_int = (int) int_cst_value (fn_arg); + else + { + if (location == UNKNOWN_LOCATION EXPR_HAS_LOCATION (fn)) + location = EXPR_LOCATION (fn); + error_at (location, __sec_implicit_index parameter must be a + constant integer expression); The term is integer constant expression not constant integer expression. FIXED! ...it looks like you're going to have to rework all this as a keyword. OK, CAN I LOOK AT THIS AFTER WE FINISH THE BUILTIN FUNCTION IMPLEMENTATION FIX? Yes. Thank you for fixing everything I pointed out. Let's now wait for Joseph to give the final ok. There are some things he suggested, that I didn't look at at all, so I am deferring to him. Thanks again.
Re: [patch, mips] Patch to control the use of integer madd/msub instructions
Steve Ellcey sell...@imgtec.com writes:
> On Mon, 2013-03-25 at 16:45 +, Richard Sandiford wrote:
>> -mllsc is a little different in that it can be used even when the ISA doesn't support it (thanks to kernel emulation). -mimadd isn't like that though: we only want to use MADD/MSUB if the ISA has it. So I think it makes sense to leave -mllsc as it is but do -mimadd in the same way as -mbranch-likely.
>>
>> Thanks,
>> Richard
>
> OK, Here is a patch that implements -mimadd in the same manner as -mbranch-likely.

It still needs the invoke.texi documentation. :-) Looks good otherwise, just a very small nit:

> -/* Masks that affect tuning.
> -
> -   PTF_AVOID_BRANCHLIKELY
> -	Set if it is usually not profitable to use branch-likely instructions
> -	for this target, typically because the branches are always predicted
> -	taken and so incur a large overhead when not taken.  */
> -#define PTF_AVOID_BRANCHLIKELY 0x1
> +/* Masks that affect tuning.  */
> +
> +/* Set PTF_AVOID_BRANCHLIKELY if is usually not profitable to use
> +   branch-likely instructions for this target, typically because
> +   the branches are always predicted taken and so incur a large
> +   overhead when not taken.  */
> +#define PTF_AVOID_BRANCHLIKELY 0x1
> +/* Set PTF_AVOID_IMADD if it is usually not profitable to use the
> +   integer madd or msub instructions because of the overhead of
> +   getting the result out of the HI/LO registers.  */
> +#define PTF_AVOID_IMADD 0x2

It wasn't obvious with just one PTF_*, but the idea was to lay this out in the same way as the mips-protos.h enums. I.e.:

/* Masks that affect tuning.

   PTF_AVOID_BRANCHLIKELY
	Set if it is usually not profitable to use branch-likely instructions
	for this target, typically because the branches are always predicted
	taken and so incur a large overhead when not taken.

   PTF_AVOID_IMADD
	Set if it is usually not profitable to use the integer MADD or MSUB
	instructions because of the overhead of getting the result out of
	the HI/LO registers.  */
#define PTF_AVOID_BRANCHLIKELY 0x1
#define PTF_AVOID_IMADD 0x2

That's trivial enough not to need a retest, but please post the invoke.texi patch.

Thanks,
Richard
C++ PATCH for c++/56699 (ICE with sizeof member of unrelated class in lambda)
My maybe_resolve_dummy patch failed to consider that 'this' might not be
relevant to the type of the dummy object.

Tested x86_64-pc-linux-gnu, applying to trunk and 4.8.

commit 5ac81af97e551d65ad46443973d337d388f35297
Author: Jason Merrill ja...@redhat.com
Date:   Mon Mar 25 17:14:04 2013 -0400

        PR c++/56699
        * semantics.c (maybe_resolve_dummy): Make sure that the enclosing
        class is derived from the type of the object.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index fb38e8d..127e2da 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -9565,7 +9565,8 @@ maybe_resolve_dummy (tree object)
 
   if (type != current_class_type
       && current_class_type
-      && LAMBDA_TYPE_P (current_class_type))
+      && LAMBDA_TYPE_P (current_class_type)
+      && DERIVED_FROM_P (type, current_nonlambda_class_type ()))
     {
       /* In a lambda, need to go through 'this' capture.  */
       tree lam = CLASSTYPE_LAMBDA_EXPR (current_class_type);
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this16.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this16.C
new file mode 100644
index 000..736d5f5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this16.C
@@ -0,0 +1,28 @@
+// PR c++/56699
+// { dg-require-effective-target c++11 }
+
+struct A
+{
+    int a;
+};
+
+struct T
+{
+    int x;
+
+    T() : x([]{
+        sizeof(::A::a);
+        return 0;
+    }())
+    {}
+};
+
+struct B
+{
+    int a;
+};
+
+void f()
+{
+    []{sizeof(B::a);};
+}
[SH, 4.7, committed] Backport fix for m2a-single-only multilib
Hi,

I've backported this one
http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01970.html
to the 4.7 branch.

Cheers,
Oleg

gcc/ChangeLog:

        Backport from mainline:
        2012-04-03  Kaz Kojima  kkoj...@gcc.gnu.org

        * config/sh/t-sh (MULTILIB_MATCHES): Match m2a-single-only to
        m2a-single instead of m2e.

Index: gcc/config/sh/t-sh
===================================================================
--- gcc/config/sh/t-sh  (revision 197065)
+++ gcc/config/sh/t-sh  (working copy)
@@ -37,7 +37,7 @@
   for abi in m1,m2,m3,m4-nofpu,m4-100-nofpu,m4-200-nofpu,m4-400,m4-500,m4-340,m4-300-nofpu,m4al,m4a-nofpu \
              m1,m2,m2a-nofpu \
              m2e,m3e,m4-single-only,m4-100-single-only,m4-200-single-only,m4-300-single-only,m4a-single-only \
-             m2e,m2a-single-only \
+             m2a-single,m2a-single-only \
              m4-single,m4-100-single,m4-200-single,m4-300-single,m4a-single \
              m4,m4-100,m4-200,m4-300,m4a \
              m5-32media,m5-compact,m5-32media \
Re: [SH] PR 49880 - Fix some more -mdiv option issues
On Wed, 2013-03-13 at 12:05 +0900, Kaz Kojima wrote:
> Oleg Endo oleg.e...@t-online.de wrote:
> > The attached patch should make the -mdiv= option work as it is described
> > in the documentation (which I updated recently as part of PR 56529).
> > Tested with 'make all' and
> >
> > make -k check-gcc RUNTESTFLAGS="sh.exp=pr49880* --target_board=sh-sim
> > \{-m2,-m2a,-m2a-nofpu,-m2a-single,-m2a-single-only,-m3,-m3e,-m4,-m4-single,
> > -m4-single-only,-m4a,-m4a-single,-m4a-single-only}"
> >
> > OK for 4.8 and 4.7?
>
> OK.

I've committed the attached patch including the sh-linux build fixes to
the 4.7 branch as revision 197071.

Cheers,
Oleg

Index: libgcc/config/sh/lib1funcs.S
===================================================================
--- libgcc/config/sh/lib1funcs.S        (revision 196758)
+++ libgcc/config/sh/lib1funcs.S        (working copy)
@@ -973,7 +973,7 @@
 #ifdef L_sdivsi3_i4
         .title "SH DIVIDE"
 !! 4 byte integer Divide code for the Renesas SH
-#ifdef __SH4__
+#if defined (__SH4__) || defined (__SH2A__)
 !! args in r4 and r5, result in fpul, clobber dr0, dr2
 
         .global GLOBAL(sdivsi3_i4)
@@ -988,7 +988,7 @@
         ftrc    dr0,fpul
 
         ENDFUNC(GLOBAL(sdivsi3_i4))
-#elif defined(__SH4_SINGLE__) || defined(__SH4_SINGLE_ONLY__) || (defined (__SH5__) && ! defined __SH4_NOFPU__)
+#elif defined (__SH2A_SINGLE__) || defined (__SH2A_SINGLE_ONLY__) || defined(__SH4_SINGLE__) || defined(__SH4_SINGLE_ONLY__) || (defined (__SH5__) && ! defined __SH4_NOFPU__)
 !! args in r4 and r5, result in fpul, clobber r2, dr0, dr2
 
 #if ! __SH5__ || __SH5__ == 32
@@ -1013,13 +1013,12 @@
 
         ENDFUNC(GLOBAL(sdivsi3_i4))
 #endif /* ! __SH5__ || __SH5__ == 32 */
-#endif /* ! __SH4__ */
+#endif /* ! __SH4__ || __SH2A__ */
 #endif
 
 #ifdef L_sdivsi3
 /* __SH4_SINGLE_ONLY__ keeps this part for link compatibility with
    sh2e/sh3e code.  */
-#if (! defined(__SH4__) && ! defined (__SH4_SINGLE__)) || defined (__linux__)
 !!
 !! Steve Chamberlain
 !! s...@cygnus.com
@@ -1336,13 +1335,12 @@
         ENDFUNC(GLOBAL(sdivsi3))
 #endif /* ! __SHMEDIA__ */
-#endif /* ! __SH4__ */
 #endif
 
 #ifdef L_udivsi3_i4
 
         .title "SH DIVIDE"
 !! 4 byte integer Divide code for the Renesas SH
-#ifdef __SH4__
+#if defined (__SH4__) || defined (__SH2A__)
 !! args in r4 and r5, result in fpul, clobber r0, r1, r4, r5, dr0, dr2, dr4,
 !! and t bit
 
@@ -1384,7 +1382,7 @@
         .double 2147483648
 
         ENDFUNC(GLOBAL(udivsi3_i4))
-#elif defined (__SH5__) && ! defined (__SH4_NOFPU__)
+#elif defined (__SH5__) && ! defined (__SH4_NOFPU__) && ! defined (__SH2A_NOFPU__)
 #if ! __SH5__ || __SH5__ == 32
 !! args in r4 and r5, result in fpul, clobber r20, r21, dr0, fr33
         .mode   SHmedia
@@ -1405,7 +1403,7 @@
         ENDFUNC(GLOBAL(udivsi3_i4))
 #endif /* ! __SH5__ || __SH5__ == 32 */
 
-#elif defined(__SH4_SINGLE__) || defined(__SH4_SINGLE_ONLY__)
+#elif defined (__SH2A_SINGLE__) || defined (__SH2A_SINGLE_ONLY__) || defined(__SH4_SINGLE__) || defined(__SH4_SINGLE_ONLY__)
 !! args in r4 and r5, result in fpul, clobber r0, r1, r4, r5, dr0, dr2, dr4
 
         .global GLOBAL(udivsi3_i4)
@@ -1460,7 +1458,6 @@
 #ifdef L_udivsi3
 /* __SH4_SINGLE_ONLY__ keeps this part for link compatibility with
    sh2e/sh3e code.  */
-#if (! defined(__SH4__) && ! defined (__SH4_SINGLE__)) || defined (__linux__)
 
 !! args in r4 and r5, result in r0, clobbers r4, pr, and t bit
         .global GLOBAL(udivsi3)
@@ -1655,7 +1652,6 @@
 
         ENDFUNC(GLOBAL(udivsi3))
 #endif /* ! __SHMEDIA__ */
-#endif /* __SH4__ */
 #endif /* L_udivsi3 */
 
 #ifdef L_udivdi3
Index: gcc/testsuite/gcc.target/sh/pr49880-1.c
===================================================================
--- gcc/testsuite/gcc.target/sh/pr49880-1.c     (revision 0)
+++ gcc/testsuite/gcc.target/sh/pr49880-1.c     (revision 0)
@@ -0,0 +1,22 @@
+/* Check that the option -mdiv=call-div1 works.  */
+/* { dg-do link { target "sh*-*-*" } }  */
+/* { dg-options "-mdiv=call-div1" }  */
+/* { dg-skip-if "" { "sh*-*-*" } { "-m5*"} { "" } }  */
+
+int
+test00 (int a, int b)
+{
+  return a / b;
+}
+
+unsigned int
+test01 (unsigned int a, unsigned b)
+{
+  return a / b;
+}
+
+int
+main (int argc, char** argv)
+{
+  return test00 (argc, 123) + test01 (argc, 123);
+}
Index: gcc/testsuite/gcc.target/sh/pr49880-2.c
===================================================================
--- gcc/testsuite/gcc.target/sh/pr49880-2.c     (revision 0)
+++ gcc/testsuite/gcc.target/sh/pr49880-2.c     (revision 0)
@@ -0,0 +1,22 @@
+/* Check that the option -mdiv=call-fp works.  */
+/* { dg-do link { target "sh*-*-*" } }  */
+/* { dg-options "-mdiv=call-fp" }  */
+/* { dg-skip-if "" { "sh*-*-*" } { "-m5*"} { "" } }  */
+
+int
+test00 (int a, int b)
+{
+  return a / b;
+}
+
+unsigned int
+test01 (unsigned int a, unsigned b)
+{
+  return a / b;
+}
+
+int
+main (int argc, char** argv)
+{
+  return test00 (argc, 123) + test01 (argc, 123);
+}
Index: gcc/testsuite/gcc.target/sh/pr49880-3.c
===================================================================
--- gcc/testsuite/gcc.target/sh/pr49880-3.c     (revision 0)
+++ gcc/testsuite/gcc.target/sh/pr49880-3.c     (revision 0)
@@ -0,0
Re: [patch, mips] Patch to control the use of integer madd/msub instructions
Steve Ellcey sell...@imgtec.com writes:
>       * config/mips/mips-cpus.def (74kc, 74kf2_1, 74kf, 74kf, 74kf1_1,
>       74kfx, 74kx, 74kf3_2): Add PTF_AVOID_IMADD.
>       * config/mips/mips.c (mips_option_override): Set IMADD default.
>       * config/mips/mips.h (PTF_AVOID_IMADD): New.
>       (ISA_HAS_MADD_MSUB): Remove MIPS16 check.
>       (GENERATE_MADD_MSUB): Remove TUNE_74K check, add MIPS16 check.
>       * config/mips/mips.md (mimadd): New flag for integer madd/msub.
>       * doc/invoke.texi (-mimadd/-mno-imadd): New.

OK, thanks.

Richard
[wwwdocs] Mention fixed SH -mdiv option for 4.8 and 4.7
Hello,

This one mentions the fixed SH -mdiv option in the changes for 4.8 and 4.7.
OK?

Cheers,
Oleg

? www_sh_mdiv.patch
Index: htdocs/gcc-4.7/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.135
diff -u -r1.135 changes.html
--- htdocs/gcc-4.7/changes.html 7 Jan 2013 22:39:43 -   1.135
+++ htdocs/gcc-4.7/changes.html 25 Mar 2013 23:13:53 -
@@ -924,6 +924,8 @@
         <li>Dynamic shift instructions on SH2A.</li>
         <li>Integer absolute value calculations.</li>
       </ul></li>
+    <li>The <code>-mdiv=</code> option for targets other than SHmedia has been
+    fixed and documented.</li>
   </ul>
 
 <h3>SPARC</h3>
Index: htdocs/gcc-4.8/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.112
diff -u -r1.112 changes.html
--- htdocs/gcc-4.8/changes.html 25 Mar 2013 08:36:45 -  1.112
+++ htdocs/gcc-4.8/changes.html 25 Mar 2013 23:13:53 -
@@ -802,6 +802,9 @@
     based displacement address modes.
     </li>
 
+    <li>The <code>-mdiv=</code> option for targets other than SHmedia has been
+    fixed and documented.</li>
+
   </ul>
 
 <h3 id="sparc">SPARC</h3>
Re: [wwwdocs] Mention fixed SH -mdiv option for 4.8 and 4.7
On Tue, 26 Mar 2013, Oleg Endo wrote:
> This one mentions the fixed SH -mdiv option in the changes for 4.8 and
> 4.7.  OK?

Looks good to me (though I cannot assert the technical correctness).

Gerald
[C++ Patch] Handle separately inline and constexpr in grokfndecl error messages
Hi,

I split out two - rather straightforward IMHO - changes from the largish
patch I posted a few days ago: this one improves the accuracy of some
error messages produced by grokfndecl.

Tested x86_64-linux.

Thanks,
Paolo.

////////////////////////////

/cp
2013-03-25  Paolo Carlini  paolo.carl...@oracle.com

        * decl.c (grokfndecl): Handle separately inline and constexpr
        error messages.

/testsuite
2013-03-25  Paolo Carlini  paolo.carl...@oracle.com

        * g++.dg/cpp0x/constexpr-friend-2.C: New.
        * g++.dg/cpp0x/constexpr-main.C: Likewise.

Index: cp/decl.c
===================================================================
--- cp/decl.c   (revision 197053)
+++ cp/decl.c   (working copy)
@@ -7426,13 +7426,16 @@ grokfndecl (tree ctype,
               return NULL_TREE;
             }
 
+          if (inlinep & 1)
+            error ("%<inline%> is not allowed in declaration of friend "
+                   "template specialization %qD",
+                   decl);
+          if (inlinep & 2)
+            error ("%<constexpr%> is not allowed in declaration of friend "
+                   "template specialization %qD",
+                   decl);
           if (inlinep)
-            {
-              error ("%<inline%> is not allowed in declaration of friend "
-                     "template specialization %qD",
-                     decl);
-              return NULL_TREE;
-            }
+            return NULL_TREE;
         }
     }
 
@@ -7471,8 +7474,10 @@ grokfndecl (tree ctype,
     {
       if (PROCESSING_REAL_TEMPLATE_DECL_P())
         error ("cannot declare %<::main%> to be a template");
-      if (inlinep)
+      if (inlinep & 1)
         error ("cannot declare %<::main%> to be inline");
+      if (inlinep & 2)
+        error ("cannot declare %<::main%> to be constexpr");
       if (!publicp)
         error ("cannot declare %<::main%> to be static");
       inlinep = 0;
Index: testsuite/g++.dg/cpp0x/constexpr-friend-2.C
===================================================================
--- testsuite/g++.dg/cpp0x/constexpr-friend-2.C (revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-friend-2.C (working copy)
@@ -0,0 +1,7 @@
+// { dg-do compile { target c++11 } }
+
+template<typename T> void f(T);
+
+template <class T> class A {
+  friend constexpr void f<>(int);  // { dg-error "'constexpr' is not allowed" }
+};
Index: testsuite/g++.dg/cpp0x/constexpr-main.C
===================================================================
--- testsuite/g++.dg/cpp0x/constexpr-main.C     (revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-main.C     (working copy)
@@ -0,0 +1,3 @@
+// { dg-do compile { target c++11 } }
+
+constexpr int main ();  // { dg-error "constexpr" }
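A minimal sketch of the idea behind the patch, outside the compiler: inlinep is treated as a bit mask in which bit 1 stands for inline and bit 2 for constexpr, so each specifier gets its own diagnostic. The names SPEC_INLINE, SPEC_CONSTEXPR and main_specifier_errors are hypothetical and only serve the illustration; this is not code from decl.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the two bits tested in the patch
   (inlinep & 1 and inlinep & 2).  */
enum { SPEC_INLINE = 1, SPEC_CONSTEXPR = 2 };

/* Store one message per offending specifier in MSGS (room for two
   entries) and return how many messages were stored.  */
size_t
main_specifier_errors (int inlinep, const char *msgs[2])
{
  size_t n = 0;

  assert (msgs != NULL);
  if (inlinep & SPEC_INLINE)
    msgs[n++] = "cannot declare '::main' to be inline";
  if (inlinep & SPEC_CONSTEXPR)
    msgs[n++] = "cannot declare '::main' to be constexpr";
  return n;
}
```

With a plain "if (inlinep)" test, a declaration that is only constexpr would have been reported as inline; testing the bits separately avoids that.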
[C++ Patch] Small grokdeclarator clean up
Hi again,

this one adds a typedef_p to grokdeclarator and uses it everywhere.

Paolo.

////////////////////////////

2013-03-25  Paolo Carlini  paolo.carl...@oracle.com

        * decl.c (grokdeclarator): Declare typedef_p and use it everywhere.

Index: decl.c
===================================================================
--- decl.c      (revision 196374)
+++ decl.c      (working copy)
@@ -8652,6 +8652,7 @@ grokdeclarator (const cp_declarator *declarator,
   bool parameter_pack_p = declarator? declarator->parameter_pack_p : false;
   bool template_type_arg = false;
   bool template_parm_flag = false;
+  bool typedef_p = decl_spec_seq_has_spec_p (declspecs, ds_typedef);
   bool constexpr_p = decl_spec_seq_has_spec_p (declspecs, ds_constexpr);
   const char *errmsg;
 
@@ -8862,7 +8863,7 @@ grokdeclarator (const cp_declarator *declarator,
 
   if (dname && IDENTIFIER_OPNAME_P (dname))
     {
-      if (decl_spec_seq_has_spec_p (declspecs, ds_typedef))
+      if (typedef_p)
         {
           error ("declaration of %qD as %<typedef%>", dname);
           return error_mark_node;
@@ -8900,7 +8901,7 @@ grokdeclarator (const cp_declarator *declarator,
   if (name == NULL)
     name = decl_context == PARM ? "parameter" : "type name";
 
-  if (constexpr_p && decl_spec_seq_has_spec_p (declspecs, ds_typedef))
+  if (constexpr_p && typedef_p)
     {
       error ("%<constexpr%> cannot appear in a typedef declaration");
       return error_mark_node;
@@ -9198,7 +9199,7 @@ grokdeclarator (const cp_declarator *declarator,
   /* Issue errors about use of storage classes for parameters.  */
   if (decl_context == PARM)
     {
-      if (decl_spec_seq_has_spec_p (declspecs, ds_typedef))
+      if (typedef_p)
         {
           error ("typedef declaration invalid in parameter declaration");
           return error_mark_node;
@@ -9242,7 +9243,7 @@ grokdeclarator (const cp_declarator *declarator,
       && ((storage_class
            && storage_class != sc_extern
            && storage_class != sc_static)
-          || decl_spec_seq_has_spec_p (declspecs, ds_typedef)))
+          || typedef_p))
     {
       error ("multiple storage classes in declaration of %qs", name);
       thread_p = false;
@@ -9256,7 +9257,7 @@ grokdeclarator (const cp_declarator *declarator,
           && (storage_class == sc_register
               || storage_class == sc_auto))
         ;
-      else if (decl_spec_seq_has_spec_p (declspecs, ds_typedef))
+      else if (typedef_p)
         ;
       else if (decl_context == FIELD
                /* C++ allows static class elements.  */
@@ -9866,8 +9867,7 @@ grokdeclarator (const cp_declarator *declarator,
                     return error_mark_node;
                   }
               }
-            else if (decl_spec_seq_has_spec_p (declspecs, ds_typedef)
-                     && current_class_type)
+            else if (typedef_p && current_class_type)
               {
                 error ("cannot declare member %<%T::%s%> within %qT",
                        ctype, name, current_class_type);
@@ -9944,8 +9944,7 @@ grokdeclarator (const cp_declarator *declarator,
           error ("non-member %qs cannot be declared %<mutable%>", name);
           storage_class = sc_none;
         }
-      else if (decl_context == TYPENAME
-               || decl_spec_seq_has_spec_p (declspecs, ds_typedef))
+      else if (decl_context == TYPENAME || typedef_p)
         {
           error ("non-object member %qs cannot be declared %<mutable%>", name);
           storage_class = sc_none;
@@ -9975,7 +9974,7 @@ grokdeclarator (const cp_declarator *declarator,
     }
 
   /* If this is declaring a typedef name, return a TYPE_DECL.  */
-  if (decl_spec_seq_has_spec_p (declspecs, ds_typedef) && decl_context != TYPENAME)
+  if (typedef_p && decl_context != TYPENAME)
     {
       tree decl;
 
Re: [google][4.7]Using CPU mocks to test code coverage of multiversioned functions
On Mon, Mar 25, 2013 at 02:24:21PM -0700, Sriraman Tallam wrote:
> Does this also mean that Paul's idea of doing:
>
> LD_CPU_FEATURES=sse,sse2 ./a.out  # run as if only sse and sse2 are available
>
> is fraught with risk when used with IFUNC, particularly on x86_64?
> Shouldn't the IFUNC resolver go through the GOT even in this case.
> This could work well for the MV testing problem I explained earlier,
> but if this is not feasible with IFUNC in play I would like my
> original proposal reconsidered.

I haven't been following the thread so can't comment, sorry.  I jumped
in on seeing Richard's suggestion re LD_PRELOAD, which is a bad idea
given glibc's current support for STT_GNU_IFUNC.

IFUNC as it stands is not a general purpose feature and interacts
badly with other features of ELF shared libraries.  Trivial testcases
can easily be created that

1) won't work on any architecture.
   eg. shared library takes address of ifunc, ifunc resolver in main
   app, resolver uses variable in shared library.
2) only work on x86_64 and powerpc64.
   eg. shared library takes address of ifunc, ifunc resolver in main
   app which is PIC, resolver uses variable in main app.
3) won't work with LD_BIND_NOW=1
   either of the above examples but shared library doesn't take
   address, just calls ifunc.

The reason for these problems is that ld.so makes a single pass over
dynamic relocations.  In the simple case of LD_BIND_NOW=1 and an
application that uses a single shared library, relocations for the
library will be applied first, then relocations for the main app.  So
if the shared library has relocations against symbols that turn out to
be ifunc, and the ifunc resolver lives in the main app, then the
resolver will run *before* the main app has been relocated.  The
resolver had better not have code that requires relocation!

Of course, the obvious fix of making ld.so do two passes over
relocations, applying ifunc relocations on the second pass, is
somewhat counterproductive.  Mostly ifunc is used to gain a speedup
when running on particular hardware.  Two passes would have to slow
down application startup.  Nonetheless, I believe that is the correct
solution if we want to make ifunc generally useful.

What we have at the moment requires quite a lot of care when using
ifunc.  Accidentally writing code that only works on x86_64 or
powerpc64 is very easy, and might lead people to think you own shares
in Intel or IBM.

-- 
Alan Modra
Australia Development Lab, IBM
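The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: struct object, struct reloc, relocate_single_pass and relocate_two_pass are hypothetical names, not ld.so data structures or functions, and "relocated" simply means that an object's ordinary relocations have been applied. The model counts how many ifunc resolvers would run inside an object that has not been relocated yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these types exist in ld.so.  */
enum reloc_kind { RELOC_PLAIN, RELOC_IFUNC };

struct object;

struct reloc
{
  enum reloc_kind kind;
  struct object *resolver_home;  /* Object holding the resolver;
                                    NULL for RELOC_PLAIN.  */
};

struct object
{
  const char *name;
  const struct reloc *relocs;
  size_t nrelocs;
  bool relocated;                /* Ordinary relocations applied?  */
};

/* Count ifunc relocations of OBJ whose resolver lives in an object
   that is not relocated at the time of the call.  */
static size_t
count_unsafe_ifuncs (const struct object *obj)
{
  size_t unsafe = 0;

  for (size_t j = 0; j < obj->nrelocs; j++)
    if (obj->relocs[j].kind == RELOC_IFUNC)
      {
        assert (obj->relocs[j].resolver_home != NULL);
        if (!obj->relocs[j].resolver_home->relocated)
          unsafe++;
      }
  return unsafe;
}

/* Single pass: each object is fully processed, ifunc relocations
   included, before the next object in OBJS is touched.  */
size_t
relocate_single_pass (struct object *objs[], size_t nobjs)
{
  size_t unsafe = 0;

  for (size_t i = 0; i < nobjs; i++)
    {
      unsafe += count_unsafe_ifuncs (objs[i]);
      objs[i]->relocated = true;
    }
  return unsafe;
}

/* Two passes: ordinary relocations for every object first, ifunc
   relocations afterwards.  */
size_t
relocate_two_pass (struct object *objs[], size_t nobjs)
{
  size_t unsafe = 0;

  for (size_t i = 0; i < nobjs; i++)
    objs[i]->relocated = true;
  for (size_t i = 0; i < nobjs; i++)
    unsafe += count_unsafe_ifuncs (objs[i]);
  return unsafe;
}
```

Passing a shared library (with one ifunc relocation whose resolver lives in the main app) ahead of the app reproduces the case from the message: the single-pass function flags the resolver call, the two-pass function does not, at the price of walking the objects twice.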
Re: [PATCH] Fix -Wformat-security warning in arm.c
On Mon, Mar 25, 2013 at 11:34 AM, Roland McGrath mcgra...@google.com wrote:
>
> gcc/
> 2013-03-25  Roland McGrath  mcgra...@google.com
>
>       * config/arm/arm.c (arm_print_operand: case 'w'): Use fputs rather
>       than fprintf with a non-constant, non-format string.

This is OK.

Thanks.

Ian
Re: Record missing equivalence
On Mon, Mar 25, 2013 at 12:08 PM, Jeff Law l...@redhat.com wrote:
> On 03/21/2013 03:44 AM, Richard Biener wrote:
>>> +  /* If LHS is an SSA_NAME and RHS is a constant and LHS was set
>>> +     via a widening type conversion, then we may be able to record
>>> +     additional equivalences.  */
>>> +  if (lhs
>>> +      && TREE_CODE (lhs) == SSA_NAME
>>> +      && is_gimple_constant (rhs))
>>> +    {
>>> +      gimple defstmt = SSA_NAME_DEF_STMT (lhs);
>>> +
>>> +      if (defstmt
>>> +          && is_gimple_assign (defstmt)
>>> +          && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (defstmt)))
>>> +        {
>>> +          tree old_rhs = gimple_assign_rhs1 (defstmt);
>>> +          tree newval = fold_convert (TREE_TYPE (old_rhs), rhs);
>>
>> You want to delay that folding and creating of a new tree node until
>> after ...
>>
>>> +
>>> +          /* If this was a widening conversion and if RHS is converted
>>> +             to the type of OLD_RHS and has the same value, then we
>>> +             can record an equivalence between OLD_RHS and the
>>> +             converted representation of RHS.  */
>>> +          if ((TYPE_PRECISION (TREE_TYPE (lhs))
>>> +               > TYPE_PRECISION (TREE_TYPE (old_rhs)))
>>
>> ... this check.
>>
>>> +              && operand_equal_p (rhs, newval, 0))
>>
>> If you'd restricted yourself to handling INTEGER_CSTs
>> then using int_fits_type_p (rhs, TREE_TYPE (lhs)) would have
>> been enough to check.
>>
>> And operand_equal_p will never return for non-equal typed
>> non-INTEGER_CSTs anyway ...
>
> Agreed.
>
> Addressed via the attached patch which was committed after a bootstrap
> and regression test on x86_64-unknown-linux-gnu.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 9bdf1e5..9db0629 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2013-03-25  Jeff Law  l...@redhat.com
> +
> +       * tree-ssa-dom.c (record_equivalences_from_incoming_edge): Rework
> +       slightly to avoid creating and folding useless trees.  Simplify
> +       slightly by restricting to INTEGER_CSTs and using int_fits_type_p.
> +
>  2013-03-25  Uros Bizjak  ubiz...@gmail.com

This breaks the bootstrap on Linux/x86:

http://gcc.gnu.org/ml/gcc-regression/2013-03/msg00148.html

../../../../../src-trunk/libstdc++-v3/src/c++98/mt_allocator.cc: In member function 'std::size_t __gnu_cxx::__pool<true>::_M_get_thread_id()':
../../../../../src-trunk/libstdc++-v3/src/c++98/mt_allocator.cc:620:3: internal compiler error: tree check: expected integer_type or enumeral_type or boolean_type or real_type or fixed_point_type, have pointer_type in int_fits_type_p, at tree.c:8325
   __pool<true>::_M_get_thread_id()
   ^
0x8ad2cef tree_check_failed(tree_node const*, char const*, int, char const*, ...)
        ../../src-trunk/gcc/tree.c:8947
0x81ba464 tree_check5(tree_node const*, char const*, int, char const*, tree_code, tree_code, tree_code, tree_code, tree_code)
        ../../src-trunk/gcc/tree.h:3987
0x8ad058d int_fits_type_p(tree_node const*, tree_node const*)
        ../../src-trunk/gcc/tree.c:8325
0x897363b record_equivalences_from_incoming_edge
        ../../src-trunk/gcc/tree-ssa-dom.c:1156
0x8977b5e dom_opt_enter_block
        ../../src-trunk/gcc/tree-ssa-dom.c:1769
0x8dccddd walk_dominator_tree(dom_walk_data*, basic_block_def*)
        ../../src-trunk/gcc/domwalk.c:210
0x8972b9c tree_ssa_dominator_optimize
        ../../src-trunk/gcc/tree-ssa-dom.c:762
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.

-- 
H.J.
Re: [wwwdocs] Mention fixed SH -mdiv option for 4.8 and 4.7
Oleg Endo oleg.e...@t-online.de wrote:
> This one mentions the fixed SH -mdiv option in the changes for 4.8 and
> 4.7.  OK?

Looks OK to me.

Regards,
        kaz
Re: Record missing equivalence
On 03/25/2013 06:41 PM, H.J. Lu wrote:
> This breaks the bootstrap on Linux/x86:
>
> http://gcc.gnu.org/ml/gcc-regression/2013-03/msg00148.html
>
> ../../../../../src-trunk/libstdc++-v3/src/c++98/mt_allocator.cc: In member function 'std::size_t __gnu_cxx::__pool<true>::_M_get_thread_id()':
> ../../../../../src-trunk/libstdc++-v3/src/c++98/mt_allocator.cc:620:3: internal compiler error: tree check: expected integer_type or enumeral_type or boolean_type or real_type or fixed_point_type, have pointer_type in int_fits_type_p, at tree.c:8325
>    __pool<true>::_M_get_thread_id()

That looks exactly like something I already fixed.  Let me make sure I
didn't post/checkin the wrong version of the patch.

jeff
Re: [C++ Patch] Handle separately inline and constexpr in grokfndecl error messages
OK. Jason
Re: [C++ Patch] Small grokdeclarator clean up
OK. Jason
Re: Record missing equivalence
On 03/25/2013 06:41 PM, H.J. Lu wrote:
> This breaks the bootstrap on Linux/x86:
>
> http://gcc.gnu.org/ml/gcc-regression/2013-03/msg00148.html
>
> ../../../../../src-trunk/libstdc++-v3/src/c++98/mt_allocator.cc: In member function 'std::size_t __gnu_cxx::__pool<true>::_M_get_thread_id()':
> ../../../../../src-trunk/libstdc++-v3/src/c++98/mt_allocator.cc:620:3: internal compiler error: tree check: expected integer_type or enumeral_type or boolean_type or real_type or fixed_point_type, have pointer_type in int_fits_type_p, at tree.c:8325
>    __pool<true>::_M_get_thread_id()

Definitely the wrong version of the patch.  It's missing the
INTEGRAL_TYPE_P test that was added to fix this exact problem.

This is the delta to get to the version that should have been checked in.

        * tree-ssa-dom.c (record_equivalences_from_incoming_edge): Add missing
        check for INTEGRAL_TYPE_P that was missing due to checking in wrong
        version of prior patch.

*** ../../GIT/gcc/gcc/tree-ssa-dom.c    Mon Mar 25 13:03:11 2013
--- tree-ssa-dom.c      Thu Mar 21 07:28:51 2013
*************** record_equivalences_from_incoming_edge (
*** 1153,1159 ****
  
          /* If the constant is in the range of the type of OLD_RHS,
             then convert the constant and record the equivalence.  */
!         if (int_fits_type_p (rhs, TREE_TYPE (old_rhs)))
            {
              tree newval = fold_convert (TREE_TYPE (old_rhs), rhs);
              record_equality (old_rhs, newval);
--- 1153,1160 ----
  
          /* If the constant is in the range of the type of OLD_RHS,
             then convert the constant and record the equivalence.  */
!         if (INTEGRAL_TYPE_P (TREE_TYPE (old_rhs))
!             && int_fits_type_p (rhs, TREE_TYPE (old_rhs)))
            {
              tree newval = fold_convert (TREE_TYPE (old_rhs), rhs);
              record_equality (old_rhs, newval);
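The shape of the fix can be sketched with a small stand-alone model. Everything here is a hypothetical stand-in: struct type, integral_type_p, value_fits_type and can_record_equivalence only mimic the roles of the tree type, INTEGRAL_TYPE_P, int_fits_type_p and the guarded condition, and the assert plays the part of the tree check that produced the ICE. The point is the ordering: the kind test must come first so that the range test is never reached for a pointer type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a type node.  */
enum type_kind { KIND_INTEGER, KIND_POINTER };

struct type
{
  enum type_kind kind;
  unsigned precision;   /* Width in bits; 1..62 in this model.  */
  bool is_unsigned;
};

bool
integral_type_p (const struct type *t)
{
  return t->kind == KIND_INTEGER;
}

/* Range test that is only defined for integral types; the assert
   models the checking failure seen in the backtrace.  */
bool
value_fits_type (long long value, const struct type *t)
{
  assert (integral_type_p (t));
  assert (t->precision >= 1 && t->precision <= 62);

  if (t->is_unsigned)
    return value >= 0 && value <= (1LL << t->precision) - 1;

  long long max = (1LL << (t->precision - 1)) - 1;
  return value >= -max - 1 && value <= max;
}

/* Guarded form: the short-circuit && keeps value_fits_type from
   ever seeing a non-integral type.  */
bool
can_record_equivalence (long long constant, const struct type *old_rhs_type)
{
  return integral_type_p (old_rhs_type)
         && value_fits_type (constant, old_rhs_type);
}
```

Calling can_record_equivalence with a pointer-kind type simply yields false, whereas calling value_fits_type directly on that type would trip the assert, which mirrors the difference between the two versions of the patch.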