Re: PATCH: Check ia32 in GCC tests
On Jul 9, 2011, at 7:22 PM, H.J. Lu wrote:

On Thu, Jul 07, 2011 at 10:29:53AM -0700, H.J. Lu wrote:

Hi,

On Linux/x86-64, when we pass RUNTESTFLAGS=--target_board='unix{-mx32}' to GCC tests, we can't check lp64/ilp32 for availability of 64bit x86 instructions. This patch adds ia32 and x32 effective targets. OK for trunk?

Here is a followup patch to use the ia32 effective target. OK for trunk?

Ok.
Re: PATCH: Check ia32 in GCC tests
On Jul 9, 2011, at 7:25 PM, H.J. Lu wrote: 2011-07-09 H.J. Lu hongjiu...@intel.com * gcc.dg/vect/costmodel/x86_64/x86_64-costmodel-vect.exp: Check ia32. * go.test/go-test.exp (go-set-goarch): Likewise. A small update. Ok.
Re: [PATCH] Fix configure --with-cloog
2011/7/6 Romain Geissler romain.geiss...@gmail.com:

I forgot configure was a generated script. Here is the patch that fixes it at the m4 macro level:

2011-07-06 Romain Geissler romain.geiss...@gmail.com

 * config/cloog.m4: Add $gmplibs to cloog $LDFLAGS
 * configure: Regenerate

Index: config/cloog.m4
===================================================================
--- config/cloog.m4 (revision 175907)
+++ config/cloog.m4 (working copy)
@@ -142,7 +142,7 @@ AC_DEFUN([CLOOG_FIND_FLAGS],
   dnl clooglibs & clooginc may have been initialized by CLOOG_INIT_FLAGS.
   CFLAGS="${CFLAGS} ${clooginc} ${gmpinc}"
   CPPFLAGS="${CPPFLAGS} ${_cloogorginc}"
-  LDFLAGS="${LDFLAGS} ${clooglibs}"
+  LDFLAGS="${LDFLAGS} ${clooglibs} ${gmplibs}"

   case $cloog_backend in
     ppl-legacy)

Ping: it seems that little patch has been forgotten. Is it OK for the trunk?

NB: I don't have write access to the trunk.

Romain Geissler
Re: plugin event for C/C++ declarations
2011/7/7 Diego Novillo dnovi...@google.com: OK. This one fell through the cracks in my inbox. Apologies. Diego. Hi, I don't have write access, can you please add the patch to the trunk ? Romain Geissler
Re: [PATCH] Remove call_expr_arg and call_expr_argp
2011/7/8 Richard Guenther richard.guent...@gmail.com: Ok. Thanks, Richard. Hi, I don't have write access, can you please add the patch to the trunk ? Romain Geissler
[Patch, Fortran, committed] Remove bogus dg-error in gfortran.dg/coarray_lock_3.f90
Hi all,

when committing the LOCK patch, I forgot to include the attached change in the testsuite, which causes testsuite failures. I planned to correct that together with other constraint-check issues, but obviously I haven't done so for several weeks. Thus, I decided to start by fixing the test suite. (The line is indeed OK, i.e. the just committed patch is correct.)

There are still some issues with LOCK_TYPE checking, in particular with LOCK_TYPEs in derived types.

Committed as Rev. 176137.

Tobias

Index: gcc/testsuite/gfortran.dg/coarray_lock_3.f90
===================================================================
--- gcc/testsuite/gfortran.dg/coarray_lock_3.f90 (revision 176136)
+++ gcc/testsuite/gfortran.dg/coarray_lock_3.f90 (working copy)
@@ -69,7 +69,7 @@
   lock(lock)
   lock(lock2(1))
   lock(lock2) ! { dg-error "must be a scalar coarray of type LOCK_TYPE" }
-  lock(lock[1]) ! { dg-error "must be a scalar coarray of type LOCK_TYPE" }
+  lock(lock[1]) ! OK
 end subroutine lock_test2

Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog (revision 176136)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2011-07-11 Tobias Burnus bur...@net-b.de
+
+ PR fortran/18918
+ * gfortran.dg/coarray_lock_3.f90: Remove bogus dg-error.
+
 2011-07-11 Georg-Johann Lay a...@gjlay.de

  * lib/target-supports.exp (check_effective_target_scheduling):
[Patch, AVR]: Fix PR39633 (missing *cmpqi)
char >> 7 is compiled to

  LSL reg
  SBC reg,reg

which leaves cc0 in a mess because the Z-flag is not set by SBC; it's propagated from LSL.

Patch as obvious; the new testcase passes and contains *cmpqi.

Ok to commit?

Johann

gcc/
 PR target/39633
 * config/avr/avr.c (notice_update_cc): For ashiftrt:QI, only offsets 1..5 set cc0 in a usable way.

testsuite/
 * gcc.target/avr/torture/pr39633.c: New test case.

Index: testsuite/gcc.target/avr/torture/pr39633.c
===================================================================
--- testsuite/gcc.target/avr/torture/pr39633.c (revision 0)
+++ testsuite/gcc.target/avr/torture/pr39633.c (revision 0)
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+char c = 42;
+
+void __attribute__((noinline,noclone))
+pr39633 (char a)
+{
+  a >>= 7;
+  if (a)
+    c = a;
+}
+
+int main()
+{
+  pr39633 (6);
+
+  if (c != 42)
+    abort();
+
+  exit(0);
+
+  return 0;
+}
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c (revision 176136)
+++ config/avr/avr.c (working copy)
@@ -1479,9 +1479,8 @@ notice_update_cc (rtx body ATTRIBUTE_UNU
    {
      rtx x = XEXP (src, 1);

-     if (GET_CODE (x) == CONST_INT
-         && INTVAL (x) > 0
-         && INTVAL (x) != 6)
+     if (CONST_INT_P (x)
+         && IN_RANGE (INTVAL (x), 1, 5))
        {
          cc_status.value1 = SET_DEST (set);
          cc_status.flags |= CC_OVERFLOW_UNUSABLE;
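For readers who do not have an AVR toolchain at hand, the C-level behaviour that the test case depends on can be sketched on any host. This is only an illustration, not part of the patch: the names shift7 and check_shift7 are made up here, and since right-shifting a negative value is implementation-defined in C, only non-negative inputs are checked.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of the patch: the C operation that the
   AVR back end expands to the LSL / SBC pair for an 8-bit operand.  */
signed char
shift7 (signed char a)
{
  /* a is promoted to int, shifted, and converted back to 8 bits.  */
  return (signed char) (a >> 7);
}

/* Every non-negative 8-bit value shifted right by 7 yields 0, so a
   following "if (a)" must not be taken.  That is the property that
   pr39633 (6) exercises in the test case.  */
void
check_shift7 (void)
{
  int i;

  for (i = 0; i <= SCHAR_MAX; i++)
    assert (shift7 ((signed char) i) == 0);
}
```

If the comparison after the shift were dropped because the condition codes were wrongly believed to be usable, the branch could be decided by the flags left by LSL, which is what the PR describes.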
[PATCH] Extra invariant motion step after ivopt
Hi,

with the changes in the IVopts pass from last year I see a reduced number of induction variables used for the first of the 3 hot loops in the 436.cactus benchmark:

http://gcc.gnu.org/viewcvs?view=revision&revision=162653

This leads to a heavily increased number of instructions in the body of the first loop in the resulting binary:

with GCC 4.5: BB 4: 52 - number of instructions
with GCC 4.6: BB 4: 110 - similar result with GCC head

With GCC 4.6 a lot of loop-invariant integer arithmetic is done in order to calculate the addresses which are used to access the array fields. Adding another invariant motion pass improves the loop even beyond the 4.5 result:

with GCC 4.6 + attached patch: BB 4: 47

The benchmark result for 436.cactus only improves by about 2% since the first loop is not actually the hottest in the trio, but the code is actually much better.

I've not been able to measure the compile time overhead. Out of 10 measurements compiling the cactus testcase, the minimum of the compile times was even lower than before. Perhaps having fewer instructions in the loop body made other passes faster. Overall I expect a very small compile time increase.

Ok for mainline?

Bye,

-Andreas-

2011-07-11 Andreas Krebbel andreas.kreb...@de.ibm.com

 * passes.c (init_optimization_passes): Add invariant motion pass after induction variable optimization.

Index: gcc/passes.c
===================================================================
*** gcc/passes.c.orig
--- gcc/passes.c
*************** init_optimization_passes (void)
*** 1363,1368 ****
--- 1363,1369 ----
      NEXT_PASS (pass_parallelize_loops);
      NEXT_PASS (pass_loop_prefetch);
      NEXT_PASS (pass_iv_optimize);
+     NEXT_PASS (pass_lim);
      NEXT_PASS (pass_tree_loop_done);
    }
    NEXT_PASS (pass_cse_reciprocals);
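As a rough illustration of what an extra invariant motion step buys, here is a hand-written before/after pair in C. It is a sketch under assumptions: the function names and the row-major grid layout are invented for this example and are not taken from the cactus sources or from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the offset j * stride is recomputed on every iteration even
   though neither j nor stride changes inside the loop.  */
double
sum_row_naive (const double *grid, size_t stride, size_t j, size_t n)
{
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    sum += grid[j * stride + i];
  return sum;
}

/* After: the invariant part of the address is computed once, which is
   what loop invariant motion does automatically.  */
double
sum_row_hoisted (const double *grid, size_t stride, size_t j, size_t n)
{
  const double *row = grid + j * stride;  /* hoisted out of the loop */
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    sum += row[i];
  return sum;
}

/* Both versions add the same elements in the same order, so the
   results are identical.  Row 2 of a 3x4 grid holding 0..11 contains
   8, 9, 10 and 11, which sum to 38.  */
void
check_hoisting (void)
{
  double grid[12];
  size_t i;

  for (i = 0; i < 12; i++)
    grid[i] = (double) i;
  assert (sum_row_naive (grid, 4, 2, 4) == sum_row_hoisted (grid, 4, 2, 4));
  assert (sum_row_hoisted (grid, 4, 2, 4) == 38.0);
}
```

The loop body of the second version has one multiply and one add fewer per iteration; with several array accesses per iteration, as described above for the benchmark, the saving multiplies accordingly.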
RFA: Fix bug in optimize_mode_switching
I work on a target with complex mode switching needs, so it can happen that in some block, for an entity, a mode is provided without the need for a set. This causes the current optimize_mode_switching to crash when it later dereferences a NULL seginfo pointer.

Fixed by using an actual flag to keep track of whether we have allocated any seginfo.

Bootstrapped on x86_64-unknown-linux-gnu.

2011-07-08 Joern Rennecke joern.renne...@embecosm.com

 * mode-switching.c (optimize_mode_switching): Fix bug in MODE_AFTER handling.

Index: mode-switching.c
===================================================================
--- mode-switching.c (revision 1670)
+++ mode-switching.c (revision 1671)
@@ -499,6 +499,7 @@ optimize_mode_switching (void)
   {
     struct seginfo *ptr;
     int last_mode = no_mode;
+    bool any_set_required = false;
     HARD_REG_SET live_now;

     REG_SET_TO_HARD_REG_SET (live_now, df_get_live_in (bb));
@@ -527,6 +528,7 @@ optimize_mode_switching (void)

         if (mode != no_mode && mode != last_mode)
           {
+            any_set_required = true;
             last_mode = mode;
             ptr = new_seginfo (mode, insn, bb->index, live_now);
             add_seginfo (info + bb->index, ptr);
@@ -548,8 +550,10 @@ optimize_mode_switching (void)
       }

     info[bb->index].computing = last_mode;
-    /* Check for blocks without ANY mode requirements.  */
-    if (last_mode == no_mode)
+    /* Check for blocks without ANY mode requirements.
+       N.B. because of MODE_AFTER, last_mode might still be different
+       from no_mode.  */
+    if (!any_set_required)
       {
         ptr = new_seginfo (no_mode, BB_END (bb), bb->index, live_now);
         add_seginfo (info + bb->index, ptr);
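The failure mode can be reproduced in a toy model. What follows is a simplified sketch and not the real pass: struct toy_insn, count_seginfo and the "after" field (standing in for the effect of MODE_AFTER) are all invented for this illustration, and the function only counts how many seginfo records a block would receive.

```c
#include <assert.h>
#include <stddef.h>

enum { NO_MODE = -1 };

/* Toy stand-in for an insn: the mode it needs, and the mode in effect
   after it (NO_MODE here means "leaves the mode unchanged").  */
struct toy_insn
{
  int needed;
  int after;
};

/* Count the seginfo records created for one block.  With use_flag == 0
   the old test (last_mode == no_mode) decides whether the block gets
   the "no requirements" record; with use_flag != 0 the new
   any_set_required flag decides.  */
int
count_seginfo (const struct toy_insn *insns, size_t n, int use_flag)
{
  int last_mode = NO_MODE;
  int any_set_required = 0;
  int segs = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      int mode = insns[i].needed;

      if (mode != NO_MODE && mode != last_mode)
        {
          any_set_required = 1;
          last_mode = mode;
          segs++;
        }
      /* Modelled MODE_AFTER: may change last_mode without any set.  */
      if (insns[i].after != NO_MODE)
        last_mode = insns[i].after;
    }

  if (use_flag ? !any_set_required : last_mode == NO_MODE)
    segs++;
  return segs;
}

void
check_seginfo (void)
{
  /* An insn that needs no mode but leaves mode 2 in effect.  */
  const struct toy_insn provides_only[] = { { NO_MODE, 2 } };
  /* An ordinary insn that needs mode 1.  */
  const struct toy_insn plain[] = { { 1, NO_MODE } };

  /* Old test: the block ends up with no record at all.  */
  assert (count_seginfo (provides_only, 1, 0) == 0);
  /* New test: the block always gets at least one record.  */
  assert (count_seginfo (provides_only, 1, 1) == 1);
  /* Blocks with a real requirement behave the same either way.  */
  assert (count_seginfo (plain, 1, 0) == 1);
  assert (count_seginfo (plain, 1, 1) == 1);
}
```

A block with zero records corresponds to the NULL seginfo pointer that is dereferenced later in the real pass.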
[SPARC] Another minor tweak
Since DWARF2 uses DW_CFA_GNU_window_save and the middle-end REG_CFA_WINDOW_SAVE to designate the thing, this makes the SPARC back-end use the same wording.

Tested on SPARC/Solaris, applied on the mainline.

2011-07-11 Eric Botcazou ebotca...@adacore.com

 * config/sparc/sparc.md (save_register_window_1): Rename to...
 (window_save): ...this.
 * config/sparc/sparc.c (emit_save_register_window): Rename to...
 (emit_window_save): ...this.
 (sparc_expand_prologue): Adjust to above renaming.

--
Eric Botcazou

Index: config/sparc/sparc.md
===================================================================
--- config/sparc/sparc.md (revision 176072)
+++ config/sparc/sparc.md (working copy)
@@ -6276,10 +6276,10 @@ (define_expand "prologue"
   DONE;
 })

-;; The save register window insn is modelled as follows.  The dwarf2
-;; information is manually added in emit_save_register_window in sparc.c.
+;; The register window save insn is modelled as follows.  The dwarf2
+;; information is manually added in emit_window_save.

-(define_insn "save_register_window_1"
+(define_insn "window_save"
   [(unspec_volatile
     [(match_operand 0 "arith_operand" "rI")]
     UNSPECV_SAVEW)]
Index: config/sparc/sparc.c
===================================================================
--- config/sparc/sparc.c (revision 176072)
+++ config/sparc/sparc.c (working copy)
@@ -4590,14 +4590,12 @@ emit_save_or_restore_local_in_regs (rtx
     save_local_or_in_reg_p, action, SORR_ADVANCE);
 }

-/* Generate a save_register_window insn.  */
+/* Emit a window_save insn.  */

 static rtx
-emit_save_register_window (rtx increment)
+emit_window_save (rtx increment)
 {
-  rtx insn;
-
-  insn = emit_insn (gen_save_register_window_1 (increment));
+  rtx insn = emit_insn (gen_window_save (increment));
   RTX_FRAME_RELATED_P (insn) = 1;

   /* The incoming return address (%o7) is saved in %i7.  */
@@ -4716,10 +4714,10 @@ sparc_expand_prologue (void)
       rtx size_int_rtx = GEN_INT (-size);

       if (size <= 4096)
-        emit_save_register_window (size_int_rtx);
+        emit_window_save (size_int_rtx);
       else if (size <= 8192)
         {
-          emit_save_register_window (GEN_INT (-4096));
+          emit_window_save (GEN_INT (-4096));
           /* %sp is not the CFA register anymore.  */
           emit_insn (gen_stack_pointer_inc (GEN_INT (4096 - size)));
         }
@@ -4727,7 +4725,7 @@ sparc_expand_prologue (void)
         {
           rtx size_rtx = gen_rtx_REG (Pmode, 1);
           emit_move_insn (size_rtx, size_int_rtx);
-          emit_save_register_window (size_rtx);
+          emit_window_save (size_rtx);
         }
     }
Re: RFA PR regression/49498
On Fri, Jul 8, 2011 at 7:25 PM, Jeff Law l...@redhat.com wrote:

As detailed in the PR, improvements to jump threading caused the relatively simple guard predicates in this testcase to become significantly more complex. The predicate complexity is enough to confuse the predicate-aware pruning of bogus uninitialized variable warnings.

Note the actual runtime flow control was improved by jump threading, which was doing exactly what it should.

Based on David's comments, it's unlikely the predicate-aware code in tree-ssa-uninit.c is going to be able to handle the more complex guards. So I'm turning off DOM (jump threading) for this testcase.

OK for trunk?

Ok.

Thanks,
Richard.
Re: [Patch, AVR]: Fix PR39633 (missing *cmpqi)
2011/7/11 Georg-Johann Lay a...@gjlay.de:

char >> 7 is compiled to

  LSL reg
  SBC reg,reg

which leaves cc0 in a mess because the Z-flag is not set by SBC; it's propagated from LSL.

Patch as obvious; the new testcase passes and contains *cmpqi.

Ok to commit?

Please, commit.

Denis.
[PATCH] Remove obsolete alias check in cgraph_redirect_edge_call_stmt_to_callee
Hi,

since (same body) aliases have their own cgraph_nodes, the check for them in cgraph_redirect_edge_call_stmt_to_callee is now unnecessary because e->callee is now the alias, not the function node. The following patch therefore removes it.

Bootstrapped and tested on x86_64-linux, OK for trunk?

Thanks,

Martin

2011-07-08 Martin Jambor mjam...@suse.cz

 * cgraphunit.c (cgraph_redirect_edge_call_stmt_to_callee): Alias check removed.

Index: src/gcc/cgraphunit.c
===================================================================
--- src.orig/gcc/cgraphunit.c
+++ src/gcc/cgraphunit.c
@@ -2380,9 +2380,7 @@ cgraph_redirect_edge_call_stmt_to_callee
 #endif

   if (e->indirect_unknown_callee
-      || decl == e->callee->decl
-      /* Don't update call from same body alias to the real function.  */
-      || (decl && cgraph_get_node (decl) == cgraph_get_node (e->callee->decl)))
+      || decl == e->callee->decl)
     return e->call_stmt;

 #ifdef ENABLE_CHECKING
Re: [PATCH] Remove call_expr_arg and call_expr_argp
On Mon, Jul 11, 2011 at 9:53 AM, Romain Geissler romain.geiss...@gmail.com wrote: 2011/7/8 Richard Guenther richard.guent...@gmail.com: Ok. Thanks, Richard. Hi, I don't have write access, can you please add the patch to the trunk ? Done. Btw, a proper changelog would have been 2011-07-11 Romain Geissler romain.geiss...@gmail.com * tree.h (call_expr_arg): Remove. (call_expr_argp): Likewise. Romain Geissler
Re: [PATCH] Extra invariant motion step after ivopt
On Mon, Jul 11, 2011 at 10:50 AM, Andreas Krebbel kreb...@linux.vnet.ibm.com wrote:

Hi,

with the changes in the IVopts pass from last year I see a reduced number of induction variables used for the first of the 3 hot loops in the 436.cactus benchmark:

http://gcc.gnu.org/viewcvs?view=revision&revision=162653

This leads to a heavily increased number of instructions in the body of the first loop in the resulting binary:

with GCC 4.5: BB 4: 52 - number of instructions
with GCC 4.6: BB 4: 110 - similar result with GCC head

With GCC 4.6 a lot of loop-invariant integer arithmetic is done in order to calculate the addresses which are used to access the array fields. Adding another invariant motion pass improves the loop even beyond the 4.5 result:

with GCC 4.6 + attached patch: BB 4: 47

The benchmark result for 436.cactus only improves by about 2% since the first loop is not actually the hottest in the trio, but the code is actually much better.

I've not been able to measure the compile time overhead. Out of 10 measurements compiling the cactus testcase, the minimum of the compile times was even lower than before. Perhaps having fewer instructions in the loop body made other passes faster. Overall I expect a very small compile time increase.

Ok for mainline?

Ok.

Thanks,
Richard.

Bye,

-Andreas-

2011-07-11 Andreas Krebbel andreas.kreb...@de.ibm.com

 * passes.c (init_optimization_passes): Add invariant motion pass after induction variable optimization.

Index: gcc/passes.c
===================================================================
*** gcc/passes.c.orig
--- gcc/passes.c
*************** init_optimization_passes (void)
*** 1363,1368 ****
--- 1363,1369 ----
      NEXT_PASS (pass_parallelize_loops);
      NEXT_PASS (pass_loop_prefetch);
      NEXT_PASS (pass_iv_optimize);
+     NEXT_PASS (pass_lim);
      NEXT_PASS (pass_tree_loop_done);
    }
    NEXT_PASS (pass_cse_reciprocals);
Re: [11/11] Fix get_mode_bounds
On 07/06/11 20:37, Richard Henderson wrote: On 07/01/2011 10:42 AM, Bernd Schmidt wrote: get_mode_bounds should also use GET_MODE_PRECISION, but this exposes a problem on ia64 - BImode needs to be handled specially here to work around another preexisting special case in gen_int_mode. Would it be better to remove the trunc_int_for_mode special case? It appears that I added that for ia64 and it's unchanged since... I tried that on ia64. It didn't bootstrap with the special case removed (configure-stage1-target-libgomp failure), and progressed further without the change. (It still failed with /usr/bin/ld: /opt/cfarm/gmp-4.2.4/lib/libgmp.a(errno.o): @gprel relocation against dynamic symbol __gmp_errno /usr/bin/ld: /opt/cfarm/gmp-4.2.4/lib/libgmp.a(errno.o): @gprel relocation against dynamic symbol __gmp_errno /usr/bin/ld: /opt/cfarm/gmp-4.2.4/lib/libgmp.a(errno.o): @gprel relocation against dynamic symbol __gmp_errno /usr/bin/ld: final link failed: Nonrepresentable section on output ) That said, I'm willing to approve the patch as-is. I'll commit it then. Bernd
Re: Ping: The TI C6X port
On 06/06/11 14:53, Gerald Pfeifer wrote: not a direct approval for any of the outstanding patches, but I am happy to report that the steering committee is appointing you maintainer of the C6X port. Please go ahead and add yourself to the MAINTAINERS file as part of the patch that actually adds the port (10/11 if I recall correctly). Internally, the question came up whether that means I can just commit the port once the preliminary patches are approved (which I think is now). Opinions? Bernd
RFA: Use create_*_operand expand_insn for movmisalign
When I added the new optabs insn-expansion routines, I looked for code that checked the predicates before calling GEN_FCN. This patch also uses the routines in two cases where we don't currently check the predicates. The benefits are: 1) We assert that the predicates really do match. 2) We support targets (like ARM) that only support restricted addressing modes. See the allows_mem stuff in maybe_legitimize_operand_same_code. Tested on x86_64-linux-gnu and (with an ARM patch to take advantage of it) on arm-linux-gnueabi. OK to install? Richard gcc/ * expr.c (expand_expr_real_1): Use expand_insn for movmisalign. Index: gcc/expr.c === --- gcc/expr.c 2011-07-11 11:29:58.0 +0100 +++ gcc/expr.c 2011-07-11 11:31:45.0 +0100 @@ -8692,7 +8692,8 @@ expand_expr_real_1 (tree exp, rtx target { addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp)); struct mem_address addr; - int icode, align; + enum insn_code icode; + int align; get_address_description (exp, addr); op0 = addr_for_mem_ref (addr, as, true); @@ -8709,18 +8710,15 @@ expand_expr_real_1 (tree exp, rtx target ((icode = optab_handler (movmisalign_optab, mode)) != CODE_FOR_nothing)) { - rtx reg, insn; + struct expand_operand ops[2]; /* We've already validated the memory, and we're creating a - new pseudo destination. The predicates really can't fail. */ - reg = gen_reg_rtx (mode); - - /* Nor can the insn generator. */ - insn = GEN_FCN (icode) (reg, temp); - gcc_assert (insn != NULL_RTX); - emit_insn (insn); - - return reg; + new pseudo destination. The predicates really can't fail, + nor can the generator. 
*/ + create_output_operand (ops[0], NULL_RTX, mode); + create_fixed_operand (ops[1], temp); + expand_insn (icode, 2, ops); + return ops[0].value; } return temp; } @@ -8732,7 +8730,8 @@ expand_expr_real_1 (tree exp, rtx target enum machine_mode address_mode; tree base = TREE_OPERAND (exp, 0); gimple def_stmt; - int icode, align; + enum insn_code icode; + int align; /* Handle expansion of non-aliased memory with non-BLKmode. That might end up in a register. */ if (TREE_CODE (base) == ADDR_EXPR) @@ -8806,17 +8805,15 @@ expand_expr_real_1 (tree exp, rtx target ((icode = optab_handler (movmisalign_optab, mode)) != CODE_FOR_nothing)) { - rtx reg, insn; + struct expand_operand ops[2]; /* We've already validated the memory, and we're creating a - new pseudo destination. The predicates really can't fail. */ - reg = gen_reg_rtx (mode); - - /* Nor can the insn generator. */ - insn = GEN_FCN (icode) (reg, temp); - emit_insn (insn); - - return reg; + new pseudo destination. The predicates really can't fail, + nor can the generator. */ + create_output_operand (ops[0], NULL_RTX, mode); + create_fixed_operand (ops[1], temp); + expand_insn (icode, 2, ops); + return ops[0].value; } return temp; }
Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant
On 07/11/2011 02:04 AM, H.J. Lu wrote:

With my original change, I got

(const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2] var_decl 0x70857960 iplane) (const_int -4 [0xfffffffffffffffc])))

I think it is safe to permute the conversion and addition operation if one operand is a constant and we are zero-extending. This is how zero-extending works.

Ok, I think I understand what you mean. The key is the

XEXP (x, 1) == convert_memory_address_addr_space (to_mode, XEXP (x, 1), as)

test. It basically ensures that the constant has 31-bit precision, because otherwise the constant would change from e.g. (const_int -0x7ffffffc) to (const_int 0x80000004) when zero-extending it from SImode to DImode.

But I'm not sure it's safe. You have

(zero_extend:DI (plus:SI FOO:SI (const_int Y)))

and you want to convert it to

(plus:DI FOO:DI (zero_extend:DI (const_int Y)))

(where the zero_extend is folded). Ignore that FOO is a SYMBOL_REF (this piece of code does not assume anything about its shape); if FOO == 0xfffffffc and Y = 8, the result will be respectively 0x4 (valid) and 0x100000004 (invalid).

If pointers extend as signed you also have a similar case. If FOO == 0x7ffffffc and Y = 8, the result of

(sign_extend:DI (plus:SI FOO:SI (const_int Y)))

and

(plus:DI FOO:DI (sign_extend:DI (const_int Y)))

will be respectively 0xffffffff80000004 (valid) and 0x80000004 (invalid).

What happens if you just return NULL instead of the assertion (good idea adding it!)?

Of course then you need to:

1) check the return values of convert_memory_address_addr_space_1, and propagate NULL up to simplify_unary_operation;

2) check in simplify-rtx.c whether the return value of convert_memory_address_1 is NULL, and only return if the return value is not NULL. This is not yet necessary (convert_memory_address is the last transformation for both SIGN_EXTEND and ZERO_EXTEND) but it is better to keep code clean.

Thanks,

Paolo
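The two counterexamples can be checked with ordinary fixed-width integers. The sketch below only mirrors the arithmetic of the RTL expressions; the function names are invented for the illustration, and converting an out-of-range value to int32_t is implementation-defined in C (GCC wraps modulo 2^32, which is what the comments assume).

```c
#include <assert.h>
#include <stdint.h>

/* (zero_extend:DI (plus:SI foo y)): add in 32 bits, then widen.  */
uint64_t
add_then_zero_extend (uint32_t foo, uint32_t y)
{
  return (uint64_t) (uint32_t) (foo + y);
}

/* (plus:DI (zero_extend:DI foo) (zero_extend:DI y)): widen, then add.  */
uint64_t
zero_extend_then_add (uint32_t foo, uint32_t y)
{
  return (uint64_t) foo + (uint64_t) y;
}

/* (sign_extend:DI (plus:SI foo y)).  */
int64_t
add_then_sign_extend (uint32_t foo, uint32_t y)
{
  return (int64_t) (int32_t) (uint32_t) (foo + y);
}

/* (plus:DI (sign_extend:DI foo) (sign_extend:DI y)).  */
int64_t
sign_extend_then_add (uint32_t foo, uint32_t y)
{
  return (int64_t) (int32_t) foo + (int64_t) (int32_t) y;
}

void
check_permutation (void)
{
  /* Zero-extension: the 32-bit add wraps, the 64-bit add does not.  */
  assert (add_then_zero_extend (UINT32_C (0xfffffffc), 8)
          == UINT64_C (0x4));
  assert (zero_extend_then_add (UINT32_C (0xfffffffc), 8)
          == UINT64_C (0x100000004));

  /* Sign-extension: the 32-bit sum crosses into the negative range.  */
  assert ((uint64_t) add_then_sign_extend (UINT32_C (0x7ffffffc), 8)
          == UINT64_C (0xffffffff80000004));
  assert ((uint64_t) sign_extend_then_add (UINT32_C (0x7ffffffc), 8)
          == UINT64_C (0x80000004));

  /* When the 32-bit add does not wrap, both orders agree.  */
  assert (add_then_zero_extend (UINT32_C (0x1000), 8)
          == zero_extend_then_add (UINT32_C (0x1000), 8));
}
```

So the permutation is only safe when the 32-bit addition is known not to wrap, and a check on the constant alone cannot guarantee that.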
Re: RFA: Use create_*_operand expand_insn for movmisalign
On Mon, Jul 11, 2011 at 12:38 PM, Richard Sandiford richard.sandif...@linaro.org wrote: When I added the new optabs insn-expansion routines, I looked for code that checked the predicates before calling GEN_FCN. This patch also uses the routines in two cases where we don't currently check the predicates. The benefits are: 1) We assert that the predicates really do match. 2) We support targets (like ARM) that only support restricted addressing modes. See the allows_mem stuff in maybe_legitimize_operand_same_code. Tested on x86_64-linux-gnu and (with an ARM patch to take advantage of it) on arm-linux-gnueabi. OK to install? Ok. Thanks, Richard. Richard gcc/ * expr.c (expand_expr_real_1): Use expand_insn for movmisalign. Index: gcc/expr.c === --- gcc/expr.c 2011-07-11 11:29:58.0 +0100 +++ gcc/expr.c 2011-07-11 11:31:45.0 +0100 @@ -8692,7 +8692,8 @@ expand_expr_real_1 (tree exp, rtx target { addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp)); struct mem_address addr; - int icode, align; + enum insn_code icode; + int align; get_address_description (exp, addr); op0 = addr_for_mem_ref (addr, as, true); @@ -8709,18 +8710,15 @@ expand_expr_real_1 (tree exp, rtx target ((icode = optab_handler (movmisalign_optab, mode)) != CODE_FOR_nothing)) { - rtx reg, insn; + struct expand_operand ops[2]; /* We've already validated the memory, and we're creating a - new pseudo destination. The predicates really can't fail. */ - reg = gen_reg_rtx (mode); - - /* Nor can the insn generator. */ - insn = GEN_FCN (icode) (reg, temp); - gcc_assert (insn != NULL_RTX); - emit_insn (insn); - - return reg; + new pseudo destination. The predicates really can't fail, + nor can the generator. 
*/ + create_output_operand (ops[0], NULL_RTX, mode); + create_fixed_operand (ops[1], temp); + expand_insn (icode, 2, ops); + return ops[0].value; } return temp; } @@ -8732,7 +8730,8 @@ expand_expr_real_1 (tree exp, rtx target enum machine_mode address_mode; tree base = TREE_OPERAND (exp, 0); gimple def_stmt; - int icode, align; + enum insn_code icode; + int align; /* Handle expansion of non-aliased memory with non-BLKmode. That might end up in a register. */ if (TREE_CODE (base) == ADDR_EXPR) @@ -8806,17 +8805,15 @@ expand_expr_real_1 (tree exp, rtx target ((icode = optab_handler (movmisalign_optab, mode)) != CODE_FOR_nothing)) { - rtx reg, insn; + struct expand_operand ops[2]; /* We've already validated the memory, and we're creating a - new pseudo destination. The predicates really can't fail. */ - reg = gen_reg_rtx (mode); - - /* Nor can the insn generator. */ - insn = GEN_FCN (icode) (reg, temp); - emit_insn (insn); - - return reg; + new pseudo destination. The predicates really can't fail, + nor can the generator. */ + create_output_operand (ops[0], NULL_RTX, mode); + create_fixed_operand (ops[1], temp); + expand_insn (icode, 2, ops); + return ops[0].value; } return temp; }
Re: [PATCH 4/6] Shrink-wrapping
On 07/11/11 13:08, Richard Sandiford wrote: Bernd Schmidt ber...@codesourcery.com writes: On 07/07/11 22:08, Richard Sandiford wrote: Sure, I understand that returns does more than return on ARM. What I meant was: we'd normally want that other stuff to be expressed in rtl alongside the (return) rtx. E.g. something like: (parallel [(return) (set (reg r4) (mem (plus (reg sp) (const_int ... (set (reg r5) (mem (plus (reg sp) (const_int ... (set (reg sp) (plus (reg sp) (const_int ...)))]) I've thought about it some more. Isn't this just a question of definitions? Much like we implicitly clobber call-used registers for a CALL rtx, we might as well define RETURN to restore the intersection between regs_ever_live and call-saved regs? This is what its current usage implies, but I guess it's never been necessary to spell it out explicitly since we don't optimize across branches to the exit block. I don't think we could assume that for all targets. On ARM, (return) restores registers, but on many targets it's done separately. An instruction that does not do this should then use simple_return, which has the appropriate definition (just return, nothing else). For most ports I expect there is no difference, since HAVE_return tends to have a guard that requires no epilogue (as the documentation suggests should be the case). Bernd
Re: [PATCH] Build a bi-arch compiler on s390-linux-gnu
On 03/25/2009 04:30 PM, Andreas Krebbel wrote: 2009-03-23 Arthur Loiret aloi...@debian.org * config.gcc (s390-*-linux*): If 'enabled_targets' is 'all', build a bi-arch compiler defaulting to 31-bit. In this case: (tmake_file): Add s390/t-linux64. * doc/install.texi: Add s390-linux to the list of targets supporting --enable-targets=all. This is ok. Thanks! Now checked in. Matthias
[PATCH][0/N][RFC] Change POINTER_PLUS_EXPR offset type requirements
This is the first patch in a series of patches that will eventually lead to changed requirements for the POINTER_PLUS_EXPR offset operand. The first and foremost goal is to reduce the number of sizetyped computations in our IL (with sizetype being that oddball type that is unsigned but sign-extended). The following patch goes for a canonical type precision for the offset operand (matching the precision of sizetype) but allows both signed (preferred) and unsigned offsets. The patch introduces several wrappers around the concept of valid offset types to be able to convert users without actually switching the implementations to a different set of types. The abstractions include - ptrofftype_p (t) - whether t is a valid type for operand 1 of ppe - convert_to_ptrofftype (t) - shortcut for the fold_convert (...) pattern, allows for advanced promotion rules - common_ptrofftype (t) - needed for the rare case when you combine two pointer-plus-expr offsets - fold_build_pointer_plus[_hwi] - for the common pattern that first converts the offset to a proper type and then builds a pointer-plus-expr (or builds an offset tree from a HWI calculation) The patch actually provides implementations for the desired final set of types. Thus, comments on the abstraction itself and the (choice of) final implementation welcome. I suppose the fold_build_pointer_plus* one is least controversical so I'll start with singling out that and its uses. Thanks, Richard. 2011-06-17 Richard Guenther rguent...@suse.de * expr.c (expand_expr_real_2): Extend the POINTER_PLUS_EXPR offset operand to pointer precision. * tree-cfg.c (verify_expr): Use ptrofftype_p for POINTER_PLUS_EXPR offset verification. (verify_gimple_assign_binary): Likewise. * tree.c (build2_stat): Likewise. (build_common_tree_nodes): Build ptrofftype and uptrofftype. * tree.h (enum size_type_kind): Add PTROFFTYPE and UPTROFFTYPE. (ptrofftype): Define. (uptrofftype): Likewise. (convert_to_ptrofftype_loc): New helper function. 
(convert_to_ptrofftype): Define. (common_ptrofftype): New helper function. (ptrofftype_p): Likewise. (fold_build_pointer_plus_loc): New helper function. (fold_build_pointer_plus_hwi_loc): Likewise. (fold_build_pointer_plus): Define. (fold_build_pointer_plus_hwi): Likewise. * tree.def (POINTER_PLUS_EXPR): Adjust documentation. Index: trunk/gcc/expr.c === *** trunk.orig/gcc/expr.c 2011-07-11 11:48:46.0 +0200 --- trunk/gcc/expr.c2011-07-11 12:57:40.0 +0200 *** expand_expr_real_2 (sepops ops, rtx targ *** 7428,7442 } case POINTER_PLUS_EXPR: ! /* Even though the sizetype mode and the pointer's mode can be different ! expand is able to handle this correctly and get the correct result out ! of the PLUS_EXPR code. */ ! /* Make sure to sign-extend the sizetype offset in a POINTER_PLUS_EXPR ! if sizetype precision is smaller than pointer precision. */ ! if (TYPE_PRECISION (sizetype) TYPE_PRECISION (type)) ! treeop1 = fold_convert_loc (loc, type, ! fold_convert_loc (loc, ssizetype, ! treeop1)); case PLUS_EXPR: /* If we are adding a constant, a VAR_DECL that is sp, fp, or ap, and something else, make sure we add the register to the constant and --- 7428,7439 } case POINTER_PLUS_EXPR: ! /* Extend/truncate the offset operand to pointer width according ! to its signedness. */ ! if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (treeop1))) ! treeop1 = fold_convert_loc (loc, type, treeop1); ! ! /* Fallthru. */ case PLUS_EXPR: /* If we are adding a constant, a VAR_DECL that is sp, fp, or ap, and something else, make sure we add the register to the constant and Index: trunk/gcc/tree-cfg.c === *** trunk.orig/gcc/tree-cfg.c 2011-07-11 11:48:46.0 +0200 --- trunk/gcc/tree-cfg.c2011-07-11 12:12:42.0 +0200 *** verify_expr (tree *tp, int *walk_subtree *** 2845,2857 error (invalid operand to pointer plus, first operand is not a pointer); return t; } ! /* Check to make sure the second operand is an integer with type of !sizetype. */ ! 
if (!useless_type_conversion_p (sizetype, !TREE_TYPE (TREE_OPERAND (t, 1 { error (invalid operand to pointer plus, second operand is not an !integer with type of sizetype); return t; } /* FALLTHROUGH */ --- 2845,2855 error (invalid
Re: [PATCH] Make VRP optimize useless conversions
On Fri, 8 Jul 2011, Richard Guenther wrote: On Fri, 8 Jul 2011, Michael Matz wrote: Hi, On Fri, 8 Jul 2011, Richard Guenther wrote: It should be indeed safe with the current handling of conversions, but better be safe. So, like the following? No. The point is that you can't compare the bounds that VRP computes with each other when the outcome affects correctness. Think about a very trivial and stupid VRP, that assigns the range [WIDEST_INT_MIN .. WIDEST_UINT_MAX] to each and every SSA name without looking at types and operations at all (assuming that this reflects the largest int type on the target). It's useless but correct. Of course we wouldn't implement such useless range discovery, but similar situations can arise when some VRP algorithms give up for certain reasons, or computation of tight bounds merely isn't implemented for some operations. Your routines need to work also in the presence of such imprecise ranges. Hence, the check that the intermediate conversion is useless needs to take into account the input value range (that's conservatively correct), and the precision and signedness of the target type (if it can represent all value of the input range the conversion was useless). It must not look at the suspected value range of the destination, precisely because it is conservative only. Ok, indeed conservative is different for what VRP does and for what a transformation must assess. So the following patch makes a conservative attempt at checking the transformation (which of course non-surprisingly matches what the VRP part does). So, more like the following? The following actually works. Bootstrapped and tested on x86_64-unknown-linux-gnu. Can you double-check it? Thanks, Richard. 2011-07-11 Richard Guenther rguent...@suse.de * tree-vrp.c (simplify_conversion_using_ranges): Manually translate the source value-range through the conversion chain. 
Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 176030) +++ gcc/tree-vrp.c (working copy) @@ -7347,30 +7347,55 @@ simplify_switch_using_ranges (gimple stm static bool simplify_conversion_using_ranges (gimple stmt) { - tree rhs1 = gimple_assign_rhs1 (stmt); - gimple def_stmt = SSA_NAME_DEF_STMT (rhs1); - value_range_t *final, *inner; + tree innerop, middleop, finaltype; + gimple def_stmt; + value_range_t *innervr; + double_int innermin, innermax, middlemin, middlemax; - /* Obtain final and inner value-ranges for a conversion - sequence (final-type)(intermediate-type)inner-type. */ - final = get_value_range (gimple_assign_lhs (stmt)); - if (final-type != VR_RANGE) -return false; + finaltype = TREE_TYPE (gimple_assign_lhs (stmt)); + middleop = gimple_assign_rhs1 (stmt); + def_stmt = SSA_NAME_DEF_STMT (middleop); if (!is_gimple_assign (def_stmt) || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))) return false; - rhs1 = gimple_assign_rhs1 (def_stmt); - if (TREE_CODE (rhs1) != SSA_NAME) + innerop = gimple_assign_rhs1 (def_stmt); + if (TREE_CODE (innerop) != SSA_NAME) return false; - inner = get_value_range (rhs1); - if (inner-type != VR_RANGE) + + /* Get the value-range of the inner operand. */ + innervr = get_value_range (innerop); + if (innervr-type != VR_RANGE + || TREE_CODE (innervr-min) != INTEGER_CST + || TREE_CODE (innervr-max) != INTEGER_CST) return false; - /* If the value-range is preserved by the conversion sequence strip - the intermediate conversion. */ - if (!tree_int_cst_equal (final-min, inner-min) - || !tree_int_cst_equal (final-max, inner-max)) + + /* Simulate the conversion chain to check if the result is equal if + the middle conversion is removed. 
*/ + innermin = tree_to_double_int (innervr-min); + innermax = tree_to_double_int (innervr-max); + middlemin = double_int_ext (innermin, TYPE_PRECISION (TREE_TYPE (middleop)), + TYPE_UNSIGNED (TREE_TYPE (middleop))); + middlemax = double_int_ext (innermax, TYPE_PRECISION (TREE_TYPE (middleop)), + TYPE_UNSIGNED (TREE_TYPE (middleop))); + /* If the middle values do not represent a proper range fail. */ + if (double_int_cmp (middlemin, middlemax, + TYPE_UNSIGNED (TREE_TYPE (middleop))) 0) return false; - gimple_assign_set_rhs1 (stmt, rhs1); + if (!double_int_equal_p (double_int_ext (middlemin, + TYPE_PRECISION (finaltype), + TYPE_UNSIGNED (finaltype)), + double_int_ext (innermin, + TYPE_PRECISION (finaltype), + TYPE_UNSIGNED (finaltype))) + || !double_int_equal_p (double_int_ext (middlemax, +
[ARM] Tighten predicates for misaligned loads and stores
While working on another patch, I noticed that the new misaligned load/store patterns allow REG+CONST addresses before reload, even though the instruction (and its constraints) don't. This patch tightens the predicates to match the existing vstN patterns. It depends on: http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00795.html which I've just applied. The patch also makes neon_struct_operand a normal predicate instead of a special predicate. That was a cut-and-pasto of mine, sorry. Tested on arm-linux-gnueabi. OK to install? Richard gcc/ * config/arm/predicates.md (neon_struct_operand): Make a normal predicate. (neon_struct_or_register_operand): New predicate. * config/arm/neon.md (movmisalignmode): Replace predicates with neon_struct_or_register_operand. (*movmisalignmode_neon_store, *movmisalignmode_neon_load): Use neon_struct_operand instead of memory_operand. Index: gcc/config/arm/predicates.md === --- gcc/config/arm/predicates.md2011-07-11 11:29:58.0 +0100 +++ gcc/config/arm/predicates.md2011-07-11 13:14:25.0 +0100 @@ -732,9 +732,13 @@ (define_special_predicate vect_par_cons return true; }) -(define_special_predicate neon_struct_operand +(define_predicate neon_struct_operand (and (match_code mem) (match_test TARGET_32BIT neon_vector_mem_operand (op, 2 +(define_predicate neon_struct_or_register_operand + (ior (match_operand 0 neon_struct_operand) + (match_operand 0 s_register_operand))) + (define_special_predicate add_operator (match_code plus)) Index: gcc/config/arm/neon.md === --- gcc/config/arm/neon.md 2011-07-11 12:21:19.0 +0100 +++ gcc/config/arm/neon.md 2011-07-11 13:14:25.0 +0100 @@ -372,8 +372,8 @@ (define_split }) (define_expand movmisalignmode - [(set (match_operand:VDQX 0 nonimmediate_operand ) - (unspec:VDQX [(match_operand:VDQX 1 general_operand )] + [(set (match_operand:VDQX 0 neon_struct_or_register_operand) + (unspec:VDQX [(match_operand:VDQX 1 neon_struct_or_register_operand)] UNSPEC_MISALIGNED_ACCESS))] TARGET_NEON !BYTES_BIG_ENDIAN { @@ -386,7 
+386,7 @@ (define_expand movmisalignmode }) (define_insn *movmisalignmode_neon_store - [(set (match_operand:VDX 0 memory_operand =Um) + [(set (match_operand:VDX 0 neon_struct_operand=Um) (unspec:VDX [(match_operand:VDX 1 s_register_operand w)] UNSPEC_MISALIGNED_ACCESS))] TARGET_NEON !BYTES_BIG_ENDIAN @@ -394,15 +394,15 @@ (define_insn *movmisalignmode_neon_st [(set_attr neon_type neon_vst1_1_2_regs_vst2_2_regs)]) (define_insn *movmisalignmode_neon_load - [(set (match_operand:VDX 0 s_register_operand =w) - (unspec:VDX [(match_operand:VDX 1 memory_operand Um)] + [(set (match_operand:VDX 0 s_register_operand =w) + (unspec:VDX [(match_operand:VDX 1 neon_struct_operand Um)] UNSPEC_MISALIGNED_ACCESS))] TARGET_NEON !BYTES_BIG_ENDIAN vld1.V_sz_elem\t{%P0}, %A1 [(set_attr neon_type neon_vld1_1_2_regs)]) (define_insn *movmisalignmode_neon_store - [(set (match_operand:VQX 0 memory_operand =Um) + [(set (match_operand:VQX 0 neon_struct_operand=Um) (unspec:VQX [(match_operand:VQX 1 s_register_operand w)] UNSPEC_MISALIGNED_ACCESS))] TARGET_NEON !BYTES_BIG_ENDIAN @@ -410,8 +410,8 @@ (define_insn *movmisalignmode_neon_st [(set_attr neon_type neon_vst1_1_2_regs_vst2_2_regs)]) (define_insn *movmisalignmode_neon_load - [(set (match_operand:VQX 0 s_register_operand =w) - (unspec:VQX [(match_operand:VQX 1 memory_operand Um)] + [(set (match_operand:VQX 0 s_register_operand =w) + (unspec:VQX [(match_operand:VQX 1 neon_struct_operand Um)] UNSPEC_MISALIGNED_ACCESS))] TARGET_NEON !BYTES_BIG_ENDIAN vld1.V_sz_elem\t{%q0}, %A1
[Patch] Add my name to the Write After Approval list.
Hello, As my very first commit to GCC I have added my name to the MAINTAINERS file in the Write After Approval section. -- I'm not overweight, I'm undertall. Index: MAINTAINERS === --- MAINTAINERS (revision 176151) +++ MAINTAINERS (revision 176152) @@ -324,6 +324,7 @@ Christian Bruel christian.br...@st.com Kevin Buettner kev...@redhat.com Andrew Cagney cag...@redhat.com +Daniel Carrera dcarr...@gmail.com Stephane Carrez stcar...@nerim.fr Gabriel Charettegch...@google.com Chandra Chavva ccha...@redhat.com
Re: [PATCH] Make VRP optimize useless conversions
Hi, On Mon, 11 Jul 2011, Richard Guenther wrote: The following actually works. Bootstrapped and tested on x86_64-unknown-linux-gnu. Can you double-check it?

Seems sensible. Given this:

  short s; int i;
  for (s = 0; s <= 127; s++)
    i += (signed char)(unsigned char)s;
  return i;

(or similar), does it remove the conversions to signed and unsigned char now? And does it _not_ remove them if the upper bound is 128, or the lower bound is -1?

Similar (now with extensions):

  signed char c; unsigned u;
  for (c = 1; c < 127; c++)
    u += (unsigned)(int)c;

The conversion to int is not necessary; but it is when the lower bound is -1.

Ciao, Michael.
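The endpoint simulation that the patch performs can be tried outside the compiler. The sketch below is not the tree-vrp.c code: it is a stand-alone illustration, the names `ext` and `middle_conversion_useless` are invented for it, and it works on plain 64-bit integers rather than double_int. It pushes both bounds of the inner range through the middle type, rejects the case where the middle values no longer form a proper range, and then compares the final results with and without the middle conversion. It only models stripping the middle conversion, which is what the patch does; it says nothing about whether the outer conversion could go as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Truncate V to PREC bits, then zero- or sign-extend it back to 64 bits.
   This plays the role double_int_ext has in the patch (assumed behaviour,
   reimplemented here for plain integers).  */
int64_t
ext (int64_t v, unsigned prec, bool uns)
{
  assert (prec >= 1 && prec <= 63);
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t u = (uint64_t) v & mask;

  /* Sign bit set in a signed type: the value is u - 2^prec.  */
  if (!uns && (u >> (prec - 1)) != 0)
    return (int64_t) u - (int64_t) mask - 1;
  return (int64_t) u;
}

/* Judge, from the bounds [MIN, MAX] of the inner operand only, whether
   (final)(middle)x can be replaced by (final)x.  */
bool
middle_conversion_useless (int64_t min, int64_t max,
                           unsigned mid_prec, bool mid_uns,
                           unsigned fin_prec, bool fin_uns)
{
  int64_t mmin = ext (min, mid_prec, mid_uns);
  int64_t mmax = ext (max, mid_prec, mid_uns);

  /* If the middle values do not represent a proper range, fail.  */
  if (mmin > mmax)
    return false;

  /* Both endpoints must come out the same with and without the
     middle conversion.  */
  return ext (mmin, fin_prec, fin_uns) == ext (min, fin_prec, fin_uns)
         && ext (mmax, fin_prec, fin_uns) == ext (max, fin_prec, fin_uns);
}
```

For the first loop above, `middle_conversion_useless (0, 127, 8, true, 8, false)` models dropping the cast to unsigned char for s in [0, 127]. A range such as [0, 300] going through unsigned char to a 32-bit int is rejected, because the upper bound changes value in the middle type.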
Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
On Thu, Jul 7, 2011 at 4:19 PM, Richard Guenther richard.guent...@gmail.com wrote: Does XLC have a similar switch whose name we can use? The IBM XL compiler is discussing a similar feature, but it is not implemented yet and does not have a formal command line option name. - David
[PLUGIN] c-family files installation
This patch adds a new exception to the plugin header flattening strategy. c-family files can't be installed in the plugin include root directory as some other files like cp/cp-tree.h will look for them in the c-family directory. Furthermore, I had to correct an include in c-pretty-print.h so that it looks for c-common.h in the c-family directory. That way, headers will work out of the box when compiling a plugin; there is no need for an additional include directory. Builds and installs fine. OK for the trunk (I have no write access)? Romain Geissler gcc/c-family/ 2011-07-11 Romain Geissler romain.geiss...@gmail.com * c-pretty-print.h: Search c-common.h in c-family gcc/ 2011-07-11 Romain Geissler romain.geiss...@gmail.com PR plugins/45348 PR plugins/48425 PR plugins/46577 * Makefile.in: Do not flatten c-family directory when installing plugin headers Index: gcc/c-family/c-pretty-print.h === --- gcc/c-family/c-pretty-print.h (revision 175907) +++ gcc/c-family/c-pretty-print.h (working copy) @@ -23,7 +23,7 @@ along with GCC; see the file COPYING3. #define GCC_C_PRETTY_PRINTER #include tree.h -#include c-common.h +#include c-family/c-common.h #include pretty-print.h Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 175907) +++ gcc/Makefile.in (working copy) @@ -4643,7 +4643,7 @@ s-header-vars: Makefile # Install the headers needed to build a plugin. install-plugin: installdirs lang.install-plugin s-header-vars -# We keep the directory structure for files in config and .def files. All +# We keep the directory structure for files in config or c-family and .def files. All # other files are flattened to a single directory. 
$(mkinstalldirs) $(DESTDIR)$(plugin_includedir) headers=`echo $(PLUGIN_HEADERS) | tr ' ' '\012' | sort -u`; \ @@ -4656,7 +4656,7 @@ install-plugin: installdirs lang.install else continue; \ fi; \ case $$path in \ - $(srcdir)/config/* | $(srcdir)/*.def ) \ + $(srcdir)/config/* | $(srcdir)/c-family/* | $(srcdir)/*.def ) \ base=`echo $$path | sed -e s|$$srcdirstrip/||`;; \ *) base=`basename $$path` ;; \ esac; \
Re: [Patch,testsuite]: Skip AVR if .text overflows
Mike Stump wrote: On Jul 8, 2011, at 7:57 AM, Georg-Johann Lay wrote: These tests are too big for AVR: .text (128 KiB) overflows and ld complains. Ok to commit? Ok. If people feel they have a nice design for `too big', let us know... I think it would have to be ld message scan based, to allow things that are just small enough, and correctly identify those that are just a hair too big. I'd preapprove work in that direction. :-) I don't know enough of dejagnu/tcl/expect to do that. For the moment I'm happy with the explicit skip-avr solution, and it's just a handful of tests that fail. Johann
Re: [patch tree-optimization]: [2 of 3]: Boolify compares more
2011/7/8 Richard Guenther richard.guent...@gmail.com: On Fri, Jul 8, 2011 at 1:32 PM, Kai Tietz ktiet...@googlemail.com wrote: 2011/7/8 Richard Guenther richard.guent...@gmail.com: On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz ktiet...@googlemail.com wrote: Hello, This patch - second of the series - adds boolification of comparisons in the gimplifier. For this, casts from/to boolean are marked as not-useless. And in fold_unary_loc casts to non-boolean integral types are preserved. The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly necessary - as long as fold-const handles 1-bit precision bitwise-expression with truth-logic - but it has been shown to short-cut some more expensive folding. So I kept it within this patch. Please split it out. Also ... The adjusted testcase gcc.dg/uninit-15.c indicates that due to optimization we lose the variable's declaration in this case. But this might be expected. In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c test-case. It's caused by always having boolean-type on conditions. So the vectorizer sees different types, which aren't handled by the vectorizer right now. Maybe this issue could be special-cased for boolean-types in tree-vect-loop, by making the operand for the used condition equal to vector-type. But this is a subject for a different patch and not addressed by this series. There is a regression in tree-ssa/vrp47.c, and the fix is addressed by the 3rd patch of this series. Bootstrapped and regression tested for all standard-languages (plus Ada and Obj-C++) on host x86_64-pc-linux-gnu. OK to apply? Regards, Kai ChangeLog 2011-07-07 Kai Tietz kti...@redhat.com * fold-const.c (fold_unary_loc): Preserve non-boolean-typed casts. * gimplify.c (gimple_boolify): Handle boolification of comparisons. (gimplify_expr): Boolify non aggregate-typed comparisons. * tree-cfg.c (verify_gimple_comparison): Check result type of comparison expression. 
* tree-ssa.c (useless_type_conversion_p): Preserve incompatible casts from/to boolean, * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification support for one-bit-precision typed X for cases X != 0 and X == 0. (forward_propagate_comparison): Adjust test of condition result. * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted. * gcc.dg/tree-ssa/pr21031.c: Likewise. * gcc.dg/tree-ssa/pr30978.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise. * gcc.dg/binop-xor1.c: Mark it as expected fail. * gcc.dg/binop-xor3.c: Likewise. * gcc.dg/uninit-15.c: Adjust reported message. Index: gcc-head/gcc/fold-const.c === --- gcc-head.orig/gcc/fold-const.c +++ gcc-head/gcc/fold-const.c @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre non-integral type. Do not fold the result as that would not simplify further, also folding again results in recursions. */ - if (INTEGRAL_TYPE_P (type)) + if (TREE_CODE (type) == BOOLEAN_TYPE) return build2_loc (loc, TREE_CODE (op0), type, TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1)); - else + else if (!INTEGRAL_TYPE_P (type)) return build3_loc (loc, COND_EXPR, type, op0, fold_convert (type, boolean_true_node), fold_convert (type, boolean_false_node)); Index: gcc-head/gcc/gimplify.c === --- gcc-head.orig/gcc/gimplify.c +++ gcc-head/gcc/gimplify.c @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr) case TRUTH_NOT_EXPR: TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0)); - /* FALLTHRU */ - case EQ_EXPR: case NE_EXPR: - case LE_EXPR: case GE_EXPR: case LT_EXPR: case GT_EXPR: /* These expressions always produce boolean results. */ - TREE_TYPE (expr) = boolean_type_node; + if (TREE_CODE (type) != BOOLEAN_TYPE) + TREE_TYPE (expr) = boolean_type_node; return expr; default: + if (COMPARISON_CLASS_P (expr)) + { + /* There expressions always prduce boolean results. 
*/ + if (TREE_CODE (type) != BOOLEAN_TYPE) + TREE_TYPE (expr) = boolean_type_node; + return expr; + } /* Other expressions that get here must have boolean values, but might need to be converted to the appropriate mode. */ - if (type == boolean_type_node) + if (TREE_CODE (type) == BOOLEAN_TYPE) return expr; return fold_convert_loc (loc, boolean_type_node, expr); } @@ -6763,7 +6768,7 @@ gimplify_expr (tree *expr_p, gimple_seq tree org_type = TREE_TYPE (*expr_p);
Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant
On Mon, Jul 11, 2011 at 4:03 AM, Paolo Bonzini bonz...@gnu.org wrote: On 07/11/2011 02:04 AM, H.J. Lu wrote: With my original change, I got (const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2] var_decl 0x70857960 iplane) (const_int -4 [0xfffc]))) I think it is safe to permute the conversion and addition operation if one operand is a constant and we are zero-extending. This is how zero-extending works. Ok, I think I understand what you mean. The key is the XEXP (x, 1) == convert_memory_address_addr_space (to_mode, XEXP (x, 1), as) test. It ensures basically that the constant has 31-bit precision, because otherwise the constant would change from e.g. (const_int -0x7ffc) to (const_int 0x8004) when zero-extending it from SImode to DImode. But I'm not sure it's safe. You have, (zero_extend:DI (plus:SI FOO:SI) (const_int Y)) and you want to convert it to (plus:DI FOO:DI (zero_extend:DI (const_int Y))) (where the zero_extend is folded). Ignore that FOO is a SYMBOL_REF (this piece of code does not assume anything about its shape); if FOO == 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and 0x10004 (invalid). This example contradicts what you said above It ensures basically that the constant has 31-bit precision. For zero-extend, the issue is address-wrap. As I understand, to support address-wrap, you need to use ptr_mode. If pointers extend as signed you also have a similar case. If FOO == 0x7ffc and Y = 8, the result of (sign_extend:DI (plus:SI FOO:SI) (const_int Y)) and (plus:DI FOO:DI (sign_extend:DI (const_int Y))) will be respectively 0x8004 (valid) and 0x8004 (invalid). What happens if you just return NULL instead of the assertion (good idea adding it!)? Of course then you need to: 1) check the return values of convert_memory_address_addr_space_1, and propagate NULL up to simplify_unary_operation; 2) check in simplify-rtx.c whether the return value of convert_memory_address_1 is NULL, and only return if the return value is not NULL. 
This is not yet necessary (convert_memory_address is the last transformation for both SIGN_EXTEND and ZERO_EXTEND) but it is better to keep code clean. I will give it a try. Thanks. -- H.J.
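Paolo's objection is ordinary modular arithmetic and can be reproduced in C with fixed-width integers. The sketch below is only an illustration: the function names are invented, and the operand values are picked for the example rather than copied from the message, whose constants are truncated in this archive. It compares extending the 32-bit sum with summing the extended operands, for both zero and sign extension.

```c
#include <assert.h>
#include <stdint.h>

/* (zero_extend:DI (plus:SI foo y)): add in 32 bits, then widen.  */
uint64_t
zext_of_sum (uint32_t foo, uint32_t y)
{
  return (uint64_t) (uint32_t) (foo + y);
}

/* (plus:DI (zero_extend:DI foo) (zero_extend:DI y)): widen, then add.  */
uint64_t
sum_of_zext (uint32_t foo, uint32_t y)
{
  return (uint64_t) foo + (uint64_t) y;
}

/* Sign-extending analogue of zext_of_sum.  The addition is done in
   unsigned arithmetic to avoid signed overflow; the conversion back to
   int32_t is implementation-defined in C and wraps on GCC.  */
int64_t
sext_of_sum (int32_t foo, int32_t y)
{
  return (int64_t) (int32_t) ((uint32_t) foo + (uint32_t) y);
}

/* Sign-extending analogue of sum_of_zext.  */
int64_t
sum_of_sext (int32_t foo, int32_t y)
{
  return (int64_t) foo + (int64_t) y;
}

/* Illustrative operands, chosen so that the 32-bit sum wraps.  */
void
check_wraparound_example (void)
{
  assert (zext_of_sum (UINT32_C (0xfffffffc), 8) == UINT64_C (0x4));
  assert (sum_of_zext (UINT32_C (0xfffffffc), 8) == UINT64_C (0x100000004));

  /* Without a wrap, both orders agree.  */
  assert (zext_of_sum (0x1000, 8) == sum_of_zext (0x1000, 8));
}
```

The two orders only disagree when the narrow addition wraps, which is why a check on the constant alone cannot make the permutation safe for an arbitrary FOO.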
Re: Ping: The TI C6X port
On Jul 11, 2011, at 3:18 AM, Bernd Schmidt ber...@codesourcery.com wrote: On 06/06/11 14:53, Gerald Pfeifer wrote: not a direct approval for any of the outstanding patches, but I am happy to report that the steering committee is appointing you maintainer of the C6X port. Please go ahead and add yourself to the MAINTAINERS file as part of the patch that actually adds the port (10/11 if I recall correctly). Internally, the question came up whether that means I can just commit the port once the preliminary patches are approved (which I think is now). Opinions? My take, you need approval for everything outside your area, once you have that, and that work is checked in, then, you can check in all the target bits, self approving those bits, if they meet your standard. You're free to reject them as well. ;-).
[PATCH] Remove cgraph_get_node_or_alias
Hi, cgraph_get_node_or_alias is now completely equivalent to cgraph_get_node, in fact it is exactly the same character-by-character. Therefore it should be removed, which is what the patch below does. Bootstrapped and tested on x86_64-linux, OK for trunk? Thanks, Martin 2011-07-11 Martin Jambor mjam...@suse.cz * cgraph.h (cgraph_get_node_or_alias): Removed declaration. * cgraph.c (cgraph_get_node_or_alias): Removed. (change_decl_assembler_name): Changed all calls to cgraph_get_node_or_alias to a call to cgraph_get_node. (cgraph_make_decl_local): Likewise. * lto-symtab.c (lto_symtab_resolve_symbols): Likewise. * varasm.c (default_binds_local_p_1): Likewise. (decl_binds_to_current_def_p): Likewise. Index: src/gcc/cgraph.c === --- src.orig/gcc/cgraph.c +++ src/gcc/cgraph.c @@ -642,29 +642,6 @@ cgraph_add_thunk (struct cgraph_node *de is assigned. */ struct cgraph_node * -cgraph_get_node_or_alias (const_tree decl) -{ - struct cgraph_node key, *node = NULL, **slot; - - gcc_assert (TREE_CODE (decl) == FUNCTION_DECL); - - if (!cgraph_hash) -return NULL; - - key.decl = CONST_CAST2 (tree, const_tree, decl); - - slot = (struct cgraph_node **) htab_find_slot (cgraph_hash, key, -NO_INSERT); - - if (slot *slot) -node = *slot; - return node; -} - -/* Returns the cgraph node assigned to DECL or NULL if no cgraph node - is assigned. 
*/ - -struct cgraph_node * cgraph_get_node (const_tree decl) { struct cgraph_node key, *node = NULL, **slot; @@ -1984,7 +1961,7 @@ change_decl_assembler_name (tree decl, t if (assembler_name_hash TREE_CODE (decl) == FUNCTION_DECL - (node = cgraph_get_node_or_alias (decl)) != NULL) + (node = cgraph_get_node (decl)) != NULL) { tree old_name = DECL_ASSEMBLER_NAME (decl); slot = htab_find_slot_with_hash (assembler_name_hash, old_name, @@ -2002,7 +1979,7 @@ change_decl_assembler_name (tree decl, t } if (assembler_name_hash TREE_CODE (decl) == FUNCTION_DECL - (node = cgraph_get_node_or_alias (decl)) != NULL) + (node = cgraph_get_node (decl)) != NULL) { slot = htab_find_slot_with_hash (assembler_name_hash, name, decl_assembler_name_hash (name), @@ -2525,7 +2502,7 @@ cgraph_make_decl_local (tree decl) old_name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); if (TREE_CODE (decl) == FUNCTION_DECL) { - struct cgraph_node *node = cgraph_get_node_or_alias (decl); + struct cgraph_node *node = cgraph_get_node (decl); change_decl_assembler_name (decl, clone_function_name (decl, local)); if (node-local.lto_file_data) Index: src/gcc/cgraph.h === --- src.orig/gcc/cgraph.h +++ src/gcc/cgraph.h @@ -480,7 +480,6 @@ struct cgraph_edge *cgraph_create_indire int, gcov_type, int); struct cgraph_indirect_call_info *cgraph_allocate_init_indirect_info (void); struct cgraph_node * cgraph_get_node (const_tree); -struct cgraph_node * cgraph_get_node_or_alias (const_tree); struct cgraph_node * cgraph_create_node (tree); struct cgraph_node * cgraph_get_create_node (tree); struct cgraph_node * cgraph_same_body_alias (struct cgraph_node *, tree, tree); Index: src/gcc/lto-symtab.c === --- src.orig/gcc/lto-symtab.c +++ src/gcc/lto-symtab.c @@ -438,7 +438,7 @@ lto_symtab_resolve_symbols (void **slot) for (e = (lto_symtab_entry_t) *slot; e; e = e-next) { if (TREE_CODE (e-decl) == FUNCTION_DECL) - e-node = cgraph_get_node_or_alias (e-decl); + e-node = cgraph_get_node (e-decl); else if (TREE_CODE 
(e-decl) == VAR_DECL) e-vnode = varpool_get_node (e-decl); } Index: src/gcc/varasm.c === --- src.orig/gcc/varasm.c +++ src/gcc/varasm.c @@ -6720,7 +6720,7 @@ default_binds_local_p_1 (const_tree exp, } else if (TREE_CODE (exp) == FUNCTION_DECL TREE_PUBLIC (exp)) { - struct cgraph_node *node = cgraph_get_node_or_alias (exp); + struct cgraph_node *node = cgraph_get_node (exp); if (node resolution_local_p (node-resolution)) resolved_locally = true; @@ -6808,7 +6808,7 @@ decl_binds_to_current_def_p (tree decl) } else if (TREE_CODE (decl) == FUNCTION_DECL) { - struct cgraph_node *node = cgraph_get_node_or_alias (decl); + struct cgraph_node *node = cgraph_get_node (decl); if (node node-resolution != LDPR_UNKNOWN) return resolution_to_local_definition_p (node-resolution);
Re: [PATCH] Remove cgraph_get_node_or_alias
Hi, cgraph_get_node_or_alias is now completely equivalent to cgraph_get_node, in fact it is exactly the same character-by-character. Therefore it should be removed, which is what the patch below does. Bootstrapped and tested on x86_64-linux, OK for trunk? OK, thanks! Honza
[libgcc] Remove libgcov.c from EXCLUDES
After installing the libgcov move patch, I noticed that I had overlooked an instance in gcc/po/EXCLUDES. This patch removes it; installed as obvious. Rainer 2011-07-09 Rainer Orth r...@cebitec.uni-bielefeld.de * EXCLUDES (libgcov.c): Remove. diff --git a/gcc/po/EXCLUDES b/gcc/po/EXCLUDES --- a/gcc/po/EXCLUDES +++ b/gcc/po/EXCLUDES @@ -40,7 +40,6 @@ gthr-win32.h gthr.h libgcc2.c libgcc2.h -libgcov.c limitx.h limity.h longlong.h -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Ping: The TI C6X port
On Mon, 11 Jul 2011, Mike Stump wrote: My take, you need approval for everything outside your area, once you have that, and that work is checked in, then, you can check in all the target bits, self approving those bits, if they meet your standard. That's my understanding as well. (With the caveat that if someone is really hating something in your new port, it would be good to take that very seriously. :-) Gerald
ARM: Clear icache when creating a closure
On a multicore ARM, you really do have to clear both caches, not just the dcache. This bug may exist in other ports too. Andrew. 2011-07-11 Andrew Haley a...@redhat.com * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache. diff --git a/src/arm/ffi.c b/src/arm/ffi.c index 885a9cb..b2e7667 100644 --- a/src/arm/ffi.c +++ b/src/arm/ffi.c @@ -558,12 +558,16 @@ ffi_closure_free (void *ptr) ({ unsigned char *__tramp = (unsigned char*)(TRAMP); \ unsigned int __fun = (unsigned int)(FUN); \ unsigned int __ctx = (unsigned int)(CTX); \ + unsigned char *insns = (unsigned char *)(CTX); \ *(unsigned int*) __tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \ *(unsigned int*) __tramp[4] = 0xe59f; /* ldr r0, [pc] */ \ *(unsigned int*) __tramp[8] = 0xe59ff000; /* ldr pc, [pc] */ \ *(unsigned int*) __tramp[12] = __ctx; \ *(unsigned int*) __tramp[16] = __fun; \ - __clear_cache((__tramp[0]), (__tramp[19])); \ + __clear_cache((__tramp[0]), (__tramp[19])); /* Clear data mapping. */ \ + __clear_cache(insns, insns + 3 * sizeof (unsigned int)); \ + /* Clear instruction \ +mapping. */\ }) #endif
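The general pattern behind this fix is: write the instruction words through a data pointer, then ask the compiler runtime to synchronize the caches over the modified range before anything executes it. Below is a minimal sketch using GCC's `__builtin___clear_cache`. It is not the libffi code: the `write_trampoline` helper and its layout (some instruction words followed by a context word and a function word) are hypothetical, the opcodes are passed in by the caller rather than spelled out, and it assumes a single mapping of the buffer. With separate writable and executable mappings, as in the patch, the flush would have to be repeated for the executable address range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy N_INSNS instruction words, followed by the
   context and function words, into TRAMP, then flush the written range.  */
void
write_trampoline (unsigned char *tramp, const uint32_t *insns,
                  size_t n_insns, uint32_t ctx, uint32_t fun)
{
  assert (tramp != NULL && insns != NULL);

  size_t code_bytes = n_insns * sizeof (uint32_t);
  size_t total = code_bytes + sizeof ctx + sizeof fun;

  memcpy (tramp, insns, code_bytes);
  memcpy (tramp + code_bytes, &ctx, sizeof ctx);
  memcpy (tramp + code_bytes + sizeof ctx, &fun, sizeof fun);

  /* Make the freshly written words visible to instruction fetch.
     On targets with coherent caches this expands to nothing.  */
  __builtin___clear_cache ((char *) tramp, (char *) tramp + total);
}
```

Passing the opcodes as data keeps the sketch target-neutral; on ARM the caller would supply the three trampoline instructions from the macro above.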
[PATCH] Fix gfc_trans_pointer_assign_need_temp (PR fortran/49698)
Hi! As the attached testcase (on x86-64) shows, inner_size is initialized to 1 of a wrong type, which results in verify_stmt ICEs because a PLUS has one 64-bit and one 32-bit operand. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk/4.6? 2011-07-11 Jakub Jelinek ja...@redhat.com PR fortran/49698 * trans-stmt.c (gfc_trans_pointer_assign_need_temp): Initialize inner_size to gfc_index_one_node instead of integer_one_node. * gfortran.dg/pr49698.f90: New test. --- gcc/fortran/trans-stmt.c.jj 2011-07-07 13:23:57.0 +0200 +++ gcc/fortran/trans-stmt.c2011-07-11 10:53:34.0 +0200 @@ -3323,7 +3323,7 @@ gfc_trans_pointer_assign_need_temp (gfc_ count = gfc_create_var (gfc_array_index_type, count); gfc_add_modify (block, count, gfc_index_zero_node); - inner_size = integer_one_node; + inner_size = gfc_index_one_node; lss = gfc_walk_expr (expr1); rss = gfc_walk_expr (expr2); if (lss == gfc_ss_terminator) --- gcc/testsuite/gfortran.dg/pr49698.f90.jj2011-07-11 11:32:01.0 +0200 +++ gcc/testsuite/gfortran.dg/pr49698.f90 2011-07-11 11:21:53.0 +0200 @@ -0,0 +1,15 @@ +! PR fortran/49698 +! { dg-do compile } +subroutine foo (x, y, z) + type S +integer, pointer :: e = null() + end type S + type T +type(S), dimension(:), allocatable :: a + end type T + type(T) :: x, y + integer :: z, i + forall (i = 1 : z) +y%a(i)%e = x%a(i)%e + end forall +end subroutine foo Jakub
Re: [PATCH] Fix gfc_trans_pointer_assign_need_temp (PR fortran/49698)
On 07/11/2011 06:24 PM, Jakub Jelinek wrote: As the attached testcase (on x86-64) shows, inner_size is initialized to 1 of a wrong type, which results in verify_stmt ICEs because a PLUS has one 64-bit and one 32-bit operand. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk/4.6? OK. I would even claim the patch is obvious. Thanks for taking care of this PR. Tobias
Define [CD]TORS_SECTION_ASM_OP on Solaris/x86 with Sun ld
While investigating why many libmudflap execution tests failed on Solaris 11/x86 with Sun ld, but succeeded on Solaris 11/SPARC, I came across the following: The first failure is fail17-frag. With MUDFLAP_OPTIONS=-trace-calls, I see mf: __mfwrap_strcpy mf: check ptr=8050f8c b=995 size=10 read location=`(strcpy src)' mf: violation pc=fee8894d location=(strcpy src) type=1 ptr=8050f8c size=10 *** mudflap violation 1 (check/read): time=1310040104.249932 ptr=8050f8c size=10 pc=fee8894d location=`(strcpy src)' The constant string isn't ever registered here, thus the failure. With gld instead, I see no failure mf: __mfwrap_strcpy mf: check ptr=8048bac b=747 size=10 read location=`(strcpy src)' and the string is registered very early: mf: register ptr=0 size=1 type=0 name='NULL' mf: register ptr=8048bac size=10 type=4 name='string literal' This registration is from tree-mudflap.c (mudflap_enqueue_constant), emitted at the end of mudflap_finish_file via cgraph_build_static_cdtor ('I', ctor_statements, MAX_RESERVED_INIT_PRIORITY-1); With gld, the registration function _GLOBAL__sub_I_00099_0_main is entered into a .ctors section, but with Sun ld I find separate .ctors and .ctors.65436 sections. varasm.c (get_cdtor_priority_section), which is called by default_named_section_asm_out_constructor, states: /* ??? This only works reliably with the GNU linker. */ This is obviously true: Sun ld doesn't coalesce different .ctors.N sections, thus the constructor isn't called. On Solaris 11/SPARC instead, CTORS_SECTION_ASM_OP is defined in sparc/sysv4.h, and thus default_ctor_section_asm_out_constructor is called. The obvious solution is to define [CD]TORS_SECTION_ASM_OP on Solaris/x86 with Sun ld, too. And indeed this fixes all remaining libmudflap failures. 
On the other hand, there's the question why tree-mudflap.c tries to create a constructor with a non-default priority on a platform with SUPPORTS_INIT_PRIORITY == 0 or at all: it seems that all is fine even with init_priority ignored. Either mudflap_enqueue_constant should check for this condition, or at least the middle-end should emit an error or a warning in this case. I've bootstrapped the following patch without regressions on i386-pc-solaris2.11 (Sun as/ld) and i386-pc-solaris2.8 (GNU as/ld). Installed on mainline. Rainer 2011-07-08 Rainer Orth r...@cebitec.uni-bielefeld.de * config/i386/sol2.h [!USE_GLD] (CTORS_SECTION_ASM_OP): Define. (DTORS_SECTION_ASM_OP): Define. diff --git a/gcc/config/i386/sol2.h b/gcc/config/i386/sol2.h --- a/gcc/config/i386/sol2.h +++ b/gcc/config/i386/sol2.h @@ -152,6 +152,13 @@ along with GCC; see the file COPYING3. #undef TARGET_ASM_NAMED_SECTION #define TARGET_ASM_NAMED_SECTION i386_solaris_elf_named_section +/* Unlike GNU ld, Sun ld doesn't coalesce .ctors.N/.dtors.N sections, so + inhibit their creation. Also cf. sparc/sysv4.h. */ +#ifndef USE_GLD +#define CTORS_SECTION_ASM_OP \t.section\t.ctors, \aw\ +#define DTORS_SECTION_ASM_OP \t.section\t.dtors, \aw\ +#endif + /* We do not need NT_VERSION notes. */ #undef X86_FILE_START_VERSION_DIRECTIVE #define X86_FILE_START_VERSION_DIRECTIVE false -- - Rainer Orth, Center for Biotechnology, Bielefeld University
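For readers unfamiliar with prioritized constructors, here is a small user-level sketch of what such a registration function looks like from C. It is not what tree-mudflap.c emits; the function and variable names are made up, and it assumes a toolchain that supports init priorities (for example GCC with GNU ld). On such a toolchain each priority may get its own numbered constructor section, and the linker has to merge and order those sections for the functions to run at all, which is the step Sun ld skips according to the analysis above.

```c
#include <assert.h>

/* Records the order in which the constructors below ran.  */
int init_order[2];
int init_count;

/* Lower numbers run earlier where init priorities are supported;
   priorities up to 100 are reserved for the implementation.  */
__attribute__ ((constructor (101))) void
early_init (void)
{
  init_order[init_count++] = 101;
}

__attribute__ ((constructor (102))) void
late_init (void)
{
  init_order[init_count++] = 102;
}

/* Call from main (or later): both constructors must already have run,
   in priority order.  */
void
check_init_order (void)
{
  assert (init_count == 2);
  assert (init_order[0] == 101 && init_order[1] == 102);
}
```

If the linker dropped the numbered sections, `init_count` would still be zero when `main` starts, which corresponds to the unregistered string literal seen in the mudflap trace.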
[build] Use libgcc copy of i386/t-crtstuff on Solaris/x86
I noticed that libgcc/configure.ac uses the gcc copy of i386/t-crtstuff on Solaris/x86, while the identical libgcc copy is perfectly fine. This patch corrects this. Bootstrapped without regressions on i386-pc-solaris2.11, installed on mainline. Rainer 2011-07-10 Rainer Orth r...@cebitec.uni-bielefeld.de * configure.ac (i?86-*-solaris2*): Use libgcc copy of i386/t-crtstuff. * configure: Regenerate. diff --git a/libgcc/configure.ac b/libgcc/configure.ac --- a/libgcc/configure.ac +++ b/libgcc/configure.ac @@ -215,9 +215,7 @@ i?86-*-solaris2* | x86_64-*-solaris2.1[[ .zero 8 EOF if AC_TRY_COMMAND(${CC-cc} -shared -nostartfiles -nodefaultlibs -o conftest.so conftest.s 1AS_MESSAGE_LOG_FD); then - # configure expects config files in libgcc/config, so need a relative - # path here. - tmake_file=${tmake_file} ../../gcc/config/i386/t-crtstuff + tmake_file=${tmake_file} i386/t-crtstuff fi ;; esac -- - Rainer Orth, Center for Biotechnology, Bielefeld University
More mudflap fixes for Solaris 11
When testing libmudflap on Solaris 8, 9, and 10 with GNU ld, I found a couple of testsuite failures: * On Solaris 10, several libmudflap.cth tests fail with FAIL: libmudflap.cth/pass37-frag.c (test for excess errors) Excess errors: /vol/gcc/src/hg/trunk/local/libmudflap/testsuite/libmudflap.cth/pass37-frag.c:23 : undefined reference to `sched_yield' Before Solaris 11, one needs -lrt for sched_yield. * On Solaris 9 (which, unlike Solaris 10+, still provides a static libc), many -static tests fail: FAIL: libmudflap.c/fail1-frag.c (-static) (test for excess errors) Excess errors: /vol/gcc/bin/gld-2.21.1: cannot find -ldl collect2: error: ld returned 1 exit status There is no static libdl, of course, and it seems to be unnecessary in the testsuite anyway. In theory, one could avoid adding it to mfconfig.exp (mfconfig_libs), but that complexity is probably unwarranted for the following. * There is no static librt, so all -static tests fail for that reason. Again, one could think about only adding it for the tests that need it, but given that linking statically against system libraries is heavily frowned upon even in Solaris 8/9, I decided against it. Instead, I chose to add mfconfig_libs to the -static check in libmudflap-init, which disables them completely for Solaris. * Not a testsuite issue, but the pth directory is now completely unused/unnecessary, so I don't create it. With this patch, all libmudflap tests (with the exception of 64-bit libmudflap.c++/pass55-frag.cxx) pass on i386-pc-solaris2.11, i386-pc-solaris2.8, sparc-sun-solaris2.8, and x86_64-unknown-linux-gnu. Ok for mainline? Rainer 2011-07-08 Rainer Orth r...@cebitec.uni-bielefeld.de * configure.ac: Don't create pth. Check for library containing sched_yield. * configure: Regenerate. * config.h.in: Regenerate. * testsuite/lib/libmudflap.exp (libmudflap-init): Use mfconfig_libs in -static check. 
diff --git a/libmudflap/configure.ac b/libmudflap/configure.ac
--- a/libmudflap/configure.ac
+++ b/libmudflap/configure.ac
@@ -112,12 +112,6 @@ else
 fi
 AC_SUBST(MF_HAVE_UINTPTR_T)

-if test ! -d pth
-then
-  # libmudflapth objects are built in this subdirectory
-  mkdir pth
-fi
-
 AC_CHECK_HEADERS(pthread.h)

 AC_MSG_CHECKING([for thread model used by GCC])
@@ -150,6 +144,7 @@ AC_SUBST(build_libmudflapth)

 AC_CHECK_LIB(dl, dlsym)
 AC_CHECK_FUNC(connect,, AC_CHECK_LIB(socket, connect))
+AC_CHECK_FUNC(sched_yield,, AC_CHECK_LIB(rt, sched_yield))

 # Calculate toolexeclibdir
 # Also toolexecdir, though it's only used in toolexeclibdir
diff --git a/libmudflap/testsuite/lib/libmudflap.exp b/libmudflap/testsuite/lib/libmudflap.exp
--- a/libmudflap/testsuite/lib/libmudflap.exp
+++ b/libmudflap/testsuite/lib/libmudflap.exp
@@ -124,9 +124,11 @@ proc libmudflap-init { language } {

     # If there is no static library then don't run tests with -static.
     global tool
+    global mfconfig_libs
     set opts "additional_flags=-static"
     lappend opts "additional_flags=-fmudflap"
     lappend opts "additional_flags=-lmudflap"
+    lappend opts "libs=$mfconfig_libs"
     set src "stlm[pid].c"
     set exe "stlm[pid].x"

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant
On Mon, Jul 11, 2011 at 8:54 AM, H.J. Lu hjl.to...@gmail.com wrote:
On Mon, Jul 11, 2011 at 4:03 AM, Paolo Bonzini bonz...@gnu.org wrote:
On 07/11/2011 02:04 AM, H.J. Lu wrote:

With my original change, I got

  (const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2] var_decl 0x70857960 iplane)
            (const_int -4 [0xfffffffffffffffc])))

I think it is safe to permute the conversion and addition operation if
one operand is a constant and we are zero-extending. This is how
zero-extending works.

Ok, I think I understand what you mean. The key is the

  XEXP (x, 1) == convert_memory_address_addr_space (to_mode, XEXP (x, 1), as)

test. It ensures basically that the constant has 31-bit precision,
because otherwise the constant would change from e.g.
(const_int -0x7ffffffc) to (const_int 0x80000004) when zero-extending it
from SImode to DImode.

But I'm not sure it's safe. You have,

  (zero_extend:DI (plus:SI FOO:SI) (const_int Y))

and you want to convert it to

  (plus:DI FOO:DI (zero_extend:DI (const_int Y)))

(where the zero_extend is folded). Ignore that FOO is a SYMBOL_REF (this
piece of code does not assume anything about its shape); if
FOO == 0xfffffffc and Y = 8, the result will be respectively 0x4 (valid)
and 0x100000004 (invalid).

This example contradicts what you said above "It ensures basically that
the constant has 31-bit precision". For zero-extend, the issue is
address-wrap. As I understand, to support address-wrap, you need to use
ptr_mode.

I am totally confused what the current code

      /* For addition we can safely permute the conversion and addition
         operation if one operand is a constant and converting the constant
         does not change it or if one operand is a constant and we are
         using a ptr_extend instruction (POINTERS_EXTEND_UNSIGNED < 0).
         We can always safely permute them if we are making the address
         narrower.  */
      if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
          || (GET_CODE (x) == PLUS
              && CONST_INT_P (XEXP (x, 1))
              && (XEXP (x, 1) == convert_memory_address_addr_space
                                   (to_mode, XEXP (x, 1), as)
                  || POINTERS_EXTEND_UNSIGNED < 0)))
        return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                               convert_memory_address_addr_space
                                 (to_mode, XEXP (x, 0), as),
                               XEXP (x, 1));

is trying to do. It doesn't support address-wrap at all, regardless of
whether converting the constant changes the constant. I think it should
be OK to permute if no instructions are allowed, like:

      if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
          || (GET_CODE (x) == PLUS
              && CONST_INT_P (XEXP (x, 1))
              && POINTERS_EXTEND_UNSIGNED != 0
              && no_emit))
        return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                               convert_memory_address_addr_space_1
                                 (to_mode, XEXP (x, 0), as, no_emit),
                               XEXP (x, 1));

-- 
H.J.
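The wrap-around case from Paolo's example (FOO == 0xfffffffc, Y = 8) can be checked with a small C sketch. This is only an illustration with fixed-width integers standing in for SImode and DImode; the function names are made up for the example and are not part of GCC.

```c
#include <stdint.h>

/* Models (zero_extend:DI (plus:SI foo y)): the addition happens in
   32 bits and wraps, then the result is widened to 64 bits.  */
static uint64_t extend_after_add(uint32_t foo, uint32_t y)
{
    uint32_t sum = (uint32_t)(foo + y);   /* wraps modulo 2^32 */
    return (uint64_t)sum;
}

/* Models (plus:DI (zero_extend:DI foo) (zero_extend:DI y)): both
   operands are widened first, so the carry out of bit 31 survives.  */
static uint64_t add_after_extend(uint32_t foo, uint32_t y)
{
    return (uint64_t)foo + (uint64_t)y;
}
```

With foo = 0xfffffffc and y = 8 the first form yields 0x4 and the second 0x100000004, so the two forms agree only when the 32-bit addition does not wrap.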
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 07/07/11 10:58, Richard Guenther wrote:

I think you should assume that series of widenings,
(int)(short)char_variable are already combined. Thus I believe you only
need to consider a single conversion in valid_types_for_madd_p.

Ok, here's my new patch.

This version only allows one conversion between the multiply and
addition, so assumes that VRP has eliminated any needless ones.

That one conversion may either be a truncate, if the mode was too large
for the meaningful data, or an extend, which must be of the right
flavour.

This means that this patch now has the same effect as the last patch, for
all valid cases (following your VRP patch), but rejects the cases where
the C language (unhelpfully) requires an intermediate temporary to be of
the 'wrong' signedness.

Hopefully the output will now be the same between both -O0 and -O2, and
programmers will continue to have to be careful about casting unsigned
variables whenever they expect purely unsigned math. :(

Is this one ok?

Andrew

2011-07-11  Andrew Stubbs  a...@codesourcery.com

        gcc/
        * tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single
        conversion statement separating multiply-and-accumulate.

        gcc/testsuite/
        * gcc.target/arm/wmul-5.c: New file.
        * gcc.target/arm/no-wmla-1.c: New file.
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+  int bc = b * c;
+  return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2135,6 +2135,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
                             enum tree_code code)
 {
   gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
+  gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
   tree type, type1, type2;
   tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
   enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
@@ -2175,6 +2176,38 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   else
     return false;

+  /* Allow for one conversion statement between the multiply
+     and addition/subtraction statement.  If there are more than
+     one conversions then we assume they would invalidate this
+     transformation.  If that's not the case then they should have
+     been folded before now.  */
+  if (CONVERT_EXPR_CODE_P (rhs1_code))
+    {
+      conv1_stmt = rhs1_stmt;
+      rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+      if (TREE_CODE (rhs1) == SSA_NAME)
+        {
+          rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+          if (is_gimple_assign (rhs1_stmt))
+            rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+        }
+      else
+        return false;
+    }
+  if (CONVERT_EXPR_CODE_P (rhs2_code))
+    {
+      conv2_stmt = rhs2_stmt;
+      rhs2 = gimple_assign_rhs1 (rhs2_stmt);
+      if (TREE_CODE (rhs2) == SSA_NAME)
+        {
+          rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+          if (is_gimple_assign (rhs2_stmt))
+            rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+        }
+      else
+        return false;
+    }
+
   /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
      is_widening_mult_p, but we still need the rhs returns.
@@ -2188,6 +2221,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
                                type2, mult_rhs2))
         return false;
       add_rhs = rhs2;
+      conv_stmt = conv1_stmt;
     }
   else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
     {
@@ -2195,6 +2229,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
                                type2, mult_rhs2))
         return false;
       add_rhs = rhs1;
+      conv_stmt = conv2_stmt;
     }
   else
     return false;
@@ -2202,6 +2237,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
     return false;

+  /* If there was a conversion between the multiply and addition
+     then we need to make sure it fits a multiply-and-accumulate.
+     The should be a single mode change which does not change the
+     value.  */
+  if (conv_stmt)
+    {
+      tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
+      tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
+      int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+      bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+      if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type))
+        {
+          /* Conversion is a truncate.  */
+          if (TYPE_PRECISION (to_type) < data_size)
+            return false;
+        }
+      else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type))
+        {
+          /* Conversion is an extend.  Check
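To see why a truncating conversion between the multiply and the add has to block the transformation (the no-wmla-1.c case above), here is a small C sketch. It is an illustration only: the function names are invented, fixed-width types stand in for int and short, and narrowing an out-of-range value to int16_t is implementation-defined in C (GCC wraps modulo 2^16).

```c
#include <stdint.h>

/* Mirrors no-wmla-1.c: the 32-bit product is narrowed to 16 bits
   before it is added, so high product bits are discarded.  */
static int32_t mac_with_truncate(int32_t a, int16_t b, int16_t c)
{
    int32_t bc = (int32_t)b * c;
    return a + (int16_t)bc;   /* implementation-defined narrowing */
}

/* What a widening multiply-and-accumulate computes: the full
   32-bit product is added.  */
static int32_t mac_widening(int32_t a, int16_t b, int16_t c)
{
    return a + (int32_t)b * (int32_t)c;
}
```

For small operands such as b = 3, c = 4 both functions agree, but for b = c = 300 the product 90000 does not fit in 16 bits and the results differ, which is why the test expects a plain mul rather than a multiply-and-accumulate.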
[PATCH] Create smaller DWARF ops for some int_loc_descriptor constants etc. (PR debug/49676)
Hi!

While working on the last dwarf2out.c patch, I've noticed we can generate
more compact DWARF location descriptions in several cases, e.g.
DW_OP_constu 0xf80000000 is 7 bytes long, while
DW_OP_lit31 DW_OP_lit31 DW_OP_shl pushes the same value to the stack and
is just 3 bytes long. The patch adjusts {,size_of_}int_loc_descriptor to
generate those and similar sequences when shorter (when the lengths are
equal, it prefers a single op over several smaller ops).

In addition to that, it attempts to optimize DW_OP_GNU_const_type into
int_loc_descriptor + DW_OP_GNU_convert when possible, and optimizes
DW_OP_plus_uconst const into int_loc_descriptor (const) DW_OP_plus if the
latter is shorter.

On the attached testcase .debug_info on x86_64 -m64 shrunk from 1092
bytes to 986 bytes (i.e. almost 10% reduction, though on an artificial
testcase), and for -m32 it shrunk from 1440 to 1295 bytes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-07-11  Jakub Jelinek  ja...@redhat.com

        PR debug/49676
        * dwarf2out.c (int_shift_loc_descriptor): New function.
        (int_loc_descriptor): If shorter, emit i as (i >> shift), shift,
        DW_OP_shl for suitable shift value.  Similarly, try to optimize
        large negative values using DW_OP_neg of a positive value if
        shorter.
        (size_of_int_shift_loc_descriptor): New function.
        (size_of_int_loc_descriptor): Adjust to match int_loc_descriptor
        changes.
        (mem_loc_descriptor) <case CONST_INT>: Emit zero-extended constants
        that fit into DWARF2_ADDR_SIZE bytes as int_loc_descriptor +
        DW_OP_GNU_convert instead of DW_OP_GNU_const_type if the former
        is shorter.
        (resolve_addr_in_expr): Optimize DW_OP_plus_uconst with a large
        addend as added DW_OP_plus if it is shorter.
--- gcc/dwarf2out.c.jj 2011-07-11 10:39:50.0 +0200
+++ gcc/dwarf2out.c 2011-07-11 14:38:07.0 +0200
@@ -10135,6 +10135,21 @@ multiple_reg_loc_descriptor (rtx rtl, rt
   return loc_result;
 }

+static unsigned long size_of_int_loc_descriptor (HOST_WIDE_INT);
+
+/* Return a location descriptor that designates a constant i,
+   as a compound operation from constant (i >> shift), constant shift
+   and DW_OP_shl.  */
+
+static dw_loc_descr_ref
+int_shift_loc_descriptor (HOST_WIDE_INT i, int shift)
+{
+  dw_loc_descr_ref ret = int_loc_descriptor (i >> shift);
+  add_loc_descr (&ret, int_loc_descriptor (shift));
+  add_loc_descr (&ret, new_loc_descr (DW_OP_shl, 0, 0));
+  return ret;
+}
+
 /* Return a location descriptor that designates a constant.  */

 static dw_loc_descr_ref
@@ -10146,15 +10161,45 @@ int_loc_descriptor (HOST_WIDE_INT i)
      defaulting to the LEB encoding.  */
   if (i >= 0)
     {
+      int clz = clz_hwi (i);
+      int ctz = ctz_hwi (i);
       if (i <= 31)
         op = (enum dwarf_location_atom) (DW_OP_lit0 + i);
       else if (i <= 0xff)
         op = DW_OP_const1u;
       else if (i <= 0xffff)
         op = DW_OP_const2u;
-      else if (HOST_BITS_PER_WIDE_INT == 32
-               || i <= 0xffffffff)
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 5
+               && clz + 5 + 255 >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_litX DW_OP_litY DW_OP_shl takes just 3 bytes and
+           DW_OP_litX DW_OP_const1u Y DW_OP_shl takes just 4 bytes,
+           while DW_OP_const4u is 5 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 5);
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 8
+               && clz + 8 + 31 >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_const1u X DW_OP_litY DW_OP_shl takes just 4 bytes,
+           while DW_OP_const4u is 5 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 8);
+      else if (HOST_BITS_PER_WIDE_INT == 32 || i <= 0xffffffff)
         op = DW_OP_const4u;
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 8
+               && clz + 8 + 255 >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_const1u X DW_OP_const1u Y DW_OP_shl takes just 5 bytes,
+           while DW_OP_constu of constant >= 0x100000000 takes at least
+           6 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 8);
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 16
+               && clz + 16 + (size_of_uleb128 (i) > 5 ? 255 : 31)
+                  >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_const2u X DW_OP_litY DW_OP_shl takes just 5 bytes,
+           DW_OP_const2u X DW_OP_const1u Y DW_OP_shl takes 6 bytes,
+           while DW_OP_constu takes in this case at least 6 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 16);
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 32
+               && clz + 32 + 31 >= HOST_BITS_PER_WIDE_INT
+               && size_of_uleb128 (i) > 6)
+        /* DW_OP_const4u X DW_OP_litY DW_OP_shl takes just 7 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 32);
       else
         op =
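The size argument behind the patch (7 bytes for DW_OP_constu 0xf80000000 versus 3 bytes for DW_OP_lit31 DW_OP_lit31 DW_OP_shl) can be reproduced with a small standalone C sketch. The helper names below are hypothetical and only model encoded sizes; they are not the dwarf2out.c functions, and only the simplest literal-shift form is covered.

```c
#include <stdint.h>

/* Bytes needed for the ULEB128 encoding of v (7 payload bits per byte).  */
static unsigned uleb128_size(uint64_t v)
{
    unsigned n = 0;
    do {
        v >>= 7;
        n++;
    } while (v != 0);
    return n;
}

/* Size of "DW_OP_constu <uleb128>": one opcode byte plus the operand.  */
static unsigned constu_size(uint64_t v)
{
    return 1 + uleb128_size(v);
}

/* Size of "DW_OP_lit<x> DW_OP_lit<s> DW_OP_shl" if v == x << s with
   both x and s in 0..31, or 0 if v has no such decomposition.  */
static unsigned lit_shift_size(uint64_t v)
{
    for (unsigned s = 0; s <= 31; s++) {
        uint64_t x = v >> s;
        if (x <= 31 && (x << s) == v)
            return 3;   /* three single-byte opcodes */
    }
    return 0;
}
```

For v = 31 << 31 (0xf80000000), constu_size gives 7 and lit_shift_size gives 3, matching the figures quoted in the message.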
[PATCH, 4.6, PR 49094, committed] Backport of fixes for PR 49094
Hi,

I have just committed the following to the 4.6 branch (after re-testing,
as revision 176166) to fix PR 49094 there too. It combines the following
two patches, which are already in trunk:

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02342.html
and
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02371.html

The first has been explicitly approved for the branch, the second is a
straightforward followup. Sorry it took so long, I nearly forgot to do
the backport.

Thanks,

Martin


2011-07-11  Martin Jambor  mjam...@suse.cz

        PR tree-optimization/49094
        * tree-sra.c (tree_non_mode_aligned_mem_p): New function.
        (build_accesses_from_assign): Use it.

        * testsuite/gcc.dg/tree-ssa/pr49094.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr49094.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr49094.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr49094.c (revision 0)
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+
+struct in_addr {
+  unsigned int s_addr;
+};
+
+struct ip {
+  unsigned char ip_p;
+  unsigned short ip_sum;
+  struct in_addr ip_src,ip_dst;
+} __attribute__ ((aligned(1), packed));
+
+struct ip ip_fw_fwd_addr;
+
+int test_alignment( char *m )
+{
+  struct ip *ip = (struct ip *) m;
+  struct in_addr pkt_dst;
+  pkt_dst = ip->ip_dst ;
+  if( pkt_dst.s_addr == 0 )
+    return 1;
+  else
+    return 0;
+}
+
+int __attribute__ ((noinline, noclone))
+intermediary (char *p)
+{
+  return test_alignment (p);
+}
+
+int
+main (int argc, char *argv[])
+{
+  ip_fw_fwd_addr.ip_dst.s_addr = 1;
+  return intermediary ((void *) &ip_fw_fwd_addr);
+}
Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c (revision 176152)
+++ gcc/tree-sra.c (working copy)
@@ -1020,6 +1020,27 @@ disqualify_ops_if_throwing_stmt (gimple
   return false;
 }

+/* Return true iff type of EXP is not sufficiently aligned.  */
+
+static bool
+tree_non_mode_aligned_mem_p (tree exp)
+{
+  enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  unsigned int align;
+
+  if (TREE_CODE (exp) == SSA_NAME
+      || TREE_CODE (exp) == MEM_REF
+      || mode == BLKmode
+      || !STRICT_ALIGNMENT)
+    return false;
+
+  align = get_object_alignment (exp, BIGGEST_ALIGNMENT);
+  if (GET_MODE_ALIGNMENT (mode) > align)
+    return true;
+
+  return false;
+}
+
 /* Scan expressions occuring in STMT, create access structures for all accesses
    to candidates for scalarization and remove those candidates which occur in
    statements or expressions that prevent them from being split apart.  Return
@@ -1044,7 +1065,10 @@ build_accesses_from_assign (gimple stmt)
   lacc = build_access_from_expr_1 (lhs, stmt, true);

   if (lacc)
-    lacc->grp_assignment_write = 1;
+    {
+      lacc->grp_assignment_write = 1;
+      lacc->grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (rhs);
+    }

   if (racc)
     {
@@ -1052,6 +1076,7 @@ build_accesses_from_assign (gimple stmt)
       if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
           && !is_gimple_reg_type (racc->type))
         bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
+      racc->grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (lhs);
     }

   if (lacc && racc
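The layout that makes the test case interesting can be checked with a short C sketch. It is an illustration only: the struct names are changed to avoid clashing with system headers, it relies on the GCC/Clang `__attribute__` syntax used in the test, and it assumes a 4-byte unsigned int and a 2-byte unsigned short.

```c
#include <stddef.h>
#include <string.h>

struct in_addr_like {
    unsigned int s_addr;
};

/* Same shape as struct ip in pr49094.c: packing removes all padding.  */
struct ip_like {
    unsigned char ip_p;
    unsigned short ip_sum;
    struct in_addr_like ip_src, ip_dst;
} __attribute__ ((aligned (1), packed));

/* Byte offset of ip_dst; 1 + 2 + 4 = 7 under the stated assumptions,
   which is not a multiple of the natural alignment of unsigned int.  */
static size_t ip_dst_offset(void)
{
    return offsetof(struct ip_like, ip_dst);
}

/* Alignment-safe read of ip_dst.s_addr from a raw buffer: memcpy makes
   no assumption about the alignment of the source bytes.  */
static unsigned int read_ip_dst(const char *m)
{
    unsigned int v;
    memcpy(&v, m + offsetof(struct ip_like, ip_dst), sizeof v);
    return v;
}
```

On a strict-alignment target a word-sized load at offset 7 would fault, which is the situation the new predicate guards against by marking such accesses unscalarizable.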
Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant
On Mon, Jul 11, 2011 at 9:55 AM, H.J. Lu hjl.to...@gmail.com wrote:
On Mon, Jul 11, 2011 at 8:54 AM, H.J. Lu hjl.to...@gmail.com wrote:
On Mon, Jul 11, 2011 at 4:03 AM, Paolo Bonzini bonz...@gnu.org wrote:
On 07/11/2011 02:04 AM, H.J. Lu wrote:

With my original change, I got

  (const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2] var_decl 0x70857960 iplane)
            (const_int -4 [0xfffffffffffffffc])))

I think it is safe to permute the conversion and addition operation if
one operand is a constant and we are zero-extending. This is how
zero-extending works.

Ok, I think I understand what you mean. The key is the

  XEXP (x, 1) == convert_memory_address_addr_space (to_mode, XEXP (x, 1), as)

test. It ensures basically that the constant has 31-bit precision,
because otherwise the constant would change from e.g.
(const_int -0x7ffffffc) to (const_int 0x80000004) when zero-extending it
from SImode to DImode.

But I'm not sure it's safe. You have,

  (zero_extend:DI (plus:SI FOO:SI) (const_int Y))

and you want to convert it to

  (plus:DI FOO:DI (zero_extend:DI (const_int Y)))

(where the zero_extend is folded). Ignore that FOO is a SYMBOL_REF (this
piece of code does not assume anything about its shape); if
FOO == 0xfffffffc and Y = 8, the result will be respectively 0x4 (valid)
and 0x100000004 (invalid).

This example contradicts what you said above "It ensures basically that
the constant has 31-bit precision". For zero-extend, the issue is
address-wrap. As I understand, to support address-wrap, you need to use
ptr_mode.

I am totally confused what the current code

      /* For addition we can safely permute the conversion and addition
         operation if one operand is a constant and converting the constant
         does not change it or if one operand is a constant and we are
         using a ptr_extend instruction (POINTERS_EXTEND_UNSIGNED < 0).
         We can always safely permute them if we are making the address
         narrower.  */
      if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
          || (GET_CODE (x) == PLUS
              && CONST_INT_P (XEXP (x, 1))
              && (XEXP (x, 1) == convert_memory_address_addr_space
                                   (to_mode, XEXP (x, 1), as)
                  || POINTERS_EXTEND_UNSIGNED < 0)))
        return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                               convert_memory_address_addr_space
                                 (to_mode, XEXP (x, 0), as),
                               XEXP (x, 1));

is trying to do. It doesn't support address-wrap at all, regardless of
whether converting the constant changes the constant. I think it should
be OK to permute if no instructions are allowed, like:

      if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
          || (GET_CODE (x) == PLUS
              && CONST_INT_P (XEXP (x, 1))
              && POINTERS_EXTEND_UNSIGNED != 0
              && no_emit))
        return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                               convert_memory_address_addr_space_1
                                 (to_mode, XEXP (x, 0), as, no_emit),
                               XEXP (x, 1));

This patch implements it.

-- 
H.J.
---
2011-07-11  H.J. Lu  hongjiu...@intel.com

        PR middle-end/47727
        * explow.c (convert_memory_address_addr_space_1): New.
        (convert_memory_address_addr_space): Use it.

        * expr.c (convert_modes_1): New.
        (convert_modes): Use it.

        * expr.h (convert_modes_1): New.

        * rtl.h (convert_memory_address_addr_space_1): New.
        (convert_memory_address_1): Likewise.

        * simplify-rtx.c (simplify_unary_operation_1): Call
        convert_memory_address_1 instead of convert_memory_address.

diff --git a/gcc/explow.c b/gcc/explow.c
index 3c692f4..d2c54ff 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -320,8 +320,9 @@ break_out_memory_refs (rtx x)
    arithmetic insns can be used.  */

 rtx
-convert_memory_address_addr_space (enum machine_mode to_mode ATTRIBUTE_UNUSED,
-                                   rtx x, addr_space_t as ATTRIBUTE_UNUSED)
+convert_memory_address_addr_space_1 (enum machine_mode to_mode ATTRIBUTE_UNUSED,
+                                     rtx x, addr_space_t as ATTRIBUTE_UNUSED,
+                                     bool no_emit ATTRIBUTE_UNUSED)
 {
 #ifndef POINTERS_EXTEND_UNSIGNED
   gcc_assert (GET_MODE (x) ==
[v3] Small testsuite patch for -Wall
Hi,

tested x86_64-linux, committed.

Paolo.

2011-07-11  Paolo Carlini  paolo.carl...@oracle.com

        * testsuite/util/testsuite_allocator.h (propagating_allocator::
        operator=(const propagating_allocator&)): Return *this.

Index: testsuite/util/testsuite_allocator.h
===================================================================
--- testsuite/util/testsuite_allocator.h (revision 176144)
+++ testsuite/util/testsuite_allocator.h (working copy)
@@ -408,6 +408,7 @@
         {
           static_assert(P2, "assigning propagating_allocator<T, true>");
           propagating_allocator(a).swap_base(*this);
+          return *this;
         }

       // postcondition: a.get_personality() == 0
[build] Move crtfastmath to toplevel libgcc
Another low-hanging fruit in the toplevel libgcc move is crtfastmath. The
following patch moves the various crtfastmath.c files over to libgcc and
removes the remnants of the gcc side of the configuration.

Unfortunately, one piece needs to stay behind: crtfastmath.o must remain
in gcc/config/i386/t-linux64 (EXTRA_MULTILIB_PARTS): If I remove just
crtfastmath.o, the libgcc Makefile detects a mismatch between the
extra_parts lists of gcc and libgcc. If I remove the whole variable, the
*-*-linux* default from config.gcc kicks in and we get another
mismatch ;-(

There's one other question here: alpha/t-crtfm uses
-frandom-seed=gcc-crtfastmath with this comment:

# FIXME drow/20061228 - I have preserved this -frandom-seed option
# while migrating this rule from the GCC directory, but I do not
# know why it is necessary if no other crt file uses it.

Is there any particular reason to either keep this or not to use it in
the generic file? This way, only i386 needs to stay separate with its use
of -msse -minline-all-stringops.

Bootstrapped without regressions on i386-pc-solaris2.11 and
x86_64-unknown-linux-gnu.

Ok for mainline?

        Rainer


2011-07-10  Rainer Orth  r...@cebitec.uni-bielefeld.de

        gcc:
        * config/alpha/crtfastmath.c: Move to ../libgcc/config/alpha.
        * config/alpha/t-crtfm: Remove.
        * config/i386/crtfastmath.c: Move to ../libgcc/config/i386.
        * config/i386/t-crtfm: Remove.
        * config/ia64/crtfastmath.c: Move to ../libgcc/config/ia64.
        * config/mips/crtfastmath.c: Move to ../libgcc/config/mips.
        * config/sparc/crtfastmath.c: Move to ../libgcc/config/sparc.
        * config/sparc/t-crtfm: Remove.
        * config.gcc (alpha*-*-linux*): Remove alpha/t-crtfm from
        tmake_file.
        (alpha*-*-freebsd*): Likewise.
        (i[34567]86-*-darwin*): Remove i386/t-crtfm from tmake_file.
        (x86_64-*-darwin*): Likewise.
        (i[34567]86-*-linux*): Likewise.
        (x86_64-*-linux*): Likewise.
        (x86_64-*-mingw*): Likewise.
        (ia64*-*-elf*): Remove crtfastmath.o from extra_parts.
        (ia64*-*-freebsd*): Likewise.
        (ia64*-*-linux*): Likewise.
        (mips64*-*-linux*): Likewise.
        (mips*-*-linux*): Likewise.
        (sparc-*-linux*): Remove sparc/t-crtfm from tmake_file.
        (sparc64-*-linux*): Likewise.
        (sparc64-*-freebsd*): Likewise.

        libgcc:
        * config/alpha/crtfastmath.c: New file.
        * config/i386/crtfastmath.c: New file.
        * config/ia64/crtfastmath.c: New file.
        * config/mips/crtfastmath.c: New file.
        * config/sparc/crtfastmath.c: New file.
        * config/t-crtfm (crtfastmath.o): Use $(srcdir) to refer to
        crtfastmath.c.
        * config/alpha/t-crtfm: Likewise.
        * config/i386/t-crtfm: Likewise.
        * config/ia64/t-ia64 (crtfastmath.o): Remove.
        * config.host (alpha*-*-freebsd*): Add alpha/t-crtfm to tmake_file.
        Add crtfastmath.o to extra_parts.
        (i[34567]86-*-darwin*): Add i386/t-crtfm to tmake_file.
        Add crtfastmath.o to extra_parts.
        (x86_64-*-darwin*): Likewise.
        (x86_64-*-mingw*): Likewise.
        (ia64*-*-elf*): Add t-crtfm to tmake_file.
        (ia64*-*-freebsd*): Likewise.
        (ia64*-*-linux*): Likewise.
        (sparc64-*-freebsd*): Add t-crtfm to tmake_file.
        Add crtfastmath.o to extra_parts.

diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -756,13 +756,13 @@ alpha*-*-linux*)
         tm_file="${tm_file} alpha/elf.h alpha/linux.h alpha/linux-elf.h glibc-stdint.h"
         extra_options="${extra_options} alpha/elf.opt"
         target_cpu_default="MASK_GAS"
-        tmake_file="${tmake_file} alpha/t-crtfm alpha/t-alpha alpha/t-ieee alpha/t-linux"
+        tmake_file="${tmake_file} alpha/t-alpha alpha/t-ieee alpha/t-linux"
         ;;
 alpha*-*-freebsd*)
         tm_file="${tm_file} ${fbsd_tm_file} alpha/elf.h alpha/freebsd.h"
         extra_options="${extra_options} alpha/elf.opt"
         target_cpu_default="MASK_GAS"
-        tmake_file="${tmake_file} alpha/t-crtfm alpha/t-alpha alpha/t-ieee"
+        tmake_file="${tmake_file} alpha/t-alpha alpha/t-ieee"
         extra_parts="crtbegin.o crtend.o crtbeginS.o crtendS.o crtbeginT.o"
         ;;
 alpha*-*-netbsd*)
@@ -1208,12 +1208,12 @@ i[34567]86-*-darwin*)
         need_64bit_isa=yes
         # Baseline choice for a machine that allows m64 support.
         with_cpu=${with_cpu:-core2}
-        tmake_file="${tmake_file} t-slibgcc-dummy i386/t-crtpc i386/t-crtfm"
+        tmake_file="${tmake_file} t-slibgcc-dummy i386/t-crtpc"
         libgcc_tm_file="$libgcc_tm_file i386/darwin-lib.h"
         ;;
 x86_64-*-darwin*)
         with_cpu=${with_cpu:-core2}
-        tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc-dummy i386/t-crtpc i386/t-crtfm"
+        tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc-dummy i386/t-crtpc"
         tm_file="${tm_file} ${cpu_type}/darwin64.h"
         libgcc_tm_file=$libgcc_tm_file
Re: [PATCH] Create smaller DWARF ops for some int_loc_descriptor constants etc. (PR debug/49676)
On 07/11/2011 09:33 AM, Jakub Jelinek wrote:

        PR debug/49676
        * dwarf2out.c (int_shift_loc_descriptor): New function.
        (int_loc_descriptor): If shorter, emit i as (i >> shift), shift,
        DW_OP_shl for suitable shift value.  Similarly, try to optimize
        large negative values using DW_OP_neg of a positive value if
        shorter.
        (size_of_int_shift_loc_descriptor): New function.
        (size_of_int_loc_descriptor): Adjust to match int_loc_descriptor
        changes.
        (mem_loc_descriptor) <case CONST_INT>: Emit zero-extended constants
        that fit into DWARF2_ADDR_SIZE bytes as int_loc_descriptor +
        DW_OP_GNU_convert instead of DW_OP_GNU_const_type if the former
        is shorter.
        (resolve_addr_in_expr): Optimize DW_OP_plus_uconst with a large
        addend as added DW_OP_plus if it is shorter.

Ok.

r~
[Committed, Backport 4.6, AVR]: PR39633 (missing *cmpqi)
Backported to 4.6: http://gcc.gnu.org/viewcvs?view=revision&revision=176143
PING^2 Re: PATCH: fix collect2 handling of --demangle and --no-demangle
Ping?

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01368.html

We had a bug report from a customer that the linker was ignoring the
--demangle and --no-demangle options when generating map files. Moreover,
it was failing in a host-dependent way; on Windows hosts, it was always
emitting demangled names in the map file, while on Linux hosts, it never
did. On Windows hosts it also ignored the setting of the
COLLECT_NO_DEMANGLE environment variable.

This turns out to be a problem in collect2, or actually, three problems:

(1) By default, collect2 is configured to filter out --demangle and
    --no-demangle from the linker options, and it tries to do demangling
    on symbol names in stdout and stderr itself instead. But, it is too
    stupid to know about the map file.

(2) Collect2 is trying to set COLLECT_NO_DEMANGLE to disable demangling
    in ld, but in a nonportable way that causes it to be always unset
    instead on Windows.

(3) If you configure with --with-demangler-in-ld to try to disable the
    collect2 demangling, there's another bug that causes it to ignore any
    explicit --demangle or --no-demangle options and only pay attention
    to whether or not COLLECT_NO_DEMANGLE is set.

The attached patch addresses all three problems:

(1) I've flipped the default to --with-demangler-in-ld=yes. Note that
    configure.ac already takes care not to let this percolate through to
    collect2 without verifying that the linker is GNU ld and that it is a
    version that supports --demangle. Perhaps back in 2004 when this
    option was first added, the ld demangling support was deemed too
    experimental to make it the default, but that's surely not the case
    any more. Also, since this has been broken since 2004, I'm not sure
    there's much reason to be concerned with backwards compatibility
    here.

(2) I fixed the COLLECT_NO_DEMANGLE environment variable setting recipe.

(3) I simplified the argument processing for --demangle and --no-demangle
    to pass them straight through to the linker when HAVE_LD_DEMANGLE is
    defined.

OK to commit?

-Sandra


2011-06-17  Sandra Loosemore  san...@codesourcery.com

        gcc/
        * configure.ac (demangler_in_ld): Default to yes.
        * configure: Regenerated.
        * collect2.c (main): When HAVE_LD_DEMANGLE is defined, don't
        mess with COLLECT_NO_DEMANGLE, and just pass --demangle and
        --no-demangle options straight through to ld.  When
        HAVE_LD_DEMANGLE is not defined, set COLLECT_NO_DEMANGLE in a
        way that has the intended effect on Windows.
Re: [build] Move crtfastmath to toplevel libgcc
On 07/11/2011 10:26 AM, Rainer Orth wrote: There's one other question here: alpha/t-crtfm uses -frandom-seed=gcc-crtfastmath with this comment: # FIXME drow/20061228 - I have preserved this -frandom-seed option # while migrating this rule from the GCC directory, but I do not # know why it is necessary if no other crt file uses it. Is there any particular reason to either keep this or not to use it in the generic file? This way, only i386 needs to stay separate with its use of -msse -minline-all-stringops. This random-seed thing is there for the mangled name we build for the constructor on Tru64. It's not needed for any target for which a .ctors section is supported. It also doesn't hurt, so you could move it to any generic build rule. r~
[PATCH, i386]: ix86_trampoline_init: use offset everywhere
Hello!

A small cleanup, no functional change. This allows us to assert that the generated code length does not exceed TRAMPOLINE_SIZE for 32-bit targets as well.

2011-07-11  Uros Bizjak  ubiz...@gmail.com

        * config/i386/i386.c (ix86_trampoline_init): Switch arms of if expr.
        Use offset everywhere.  Always assert that offset <= TRAMPOLINE_SIZE.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline.

Uros.

Index: i386.c
===================================================================
--- i386.c (revision 176159)
+++ i386.c (working copy)
@@ -22683,54 +22683,14 @@ static void
 ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 {
   rtx mem, fnaddr;
+  int opcode;
+  int offset = 0;
 
   fnaddr = XEXP (DECL_RTL (fndecl), 0);
 
-  if (!TARGET_64BIT)
-    {
-      rtx disp, chain;
-      int opcode;
-
-      /* Depending on the static chain location, either load a register
-         with a constant, or push the constant to the stack.  All of the
-         instructions are the same size.  */
-      chain = ix86_static_chain (fndecl, true);
-      if (REG_P (chain))
-        {
-          if (REGNO (chain) == CX_REG)
-            opcode = 0xb9;
-          else if (REGNO (chain) == AX_REG)
-            opcode = 0xb8;
-          else
-            gcc_unreachable ();
-        }
-      else
-        opcode = 0x68;
-
-      mem = adjust_address (m_tramp, QImode, 0);
-      emit_move_insn (mem, gen_int_mode (opcode, QImode));
-
-      mem = adjust_address (m_tramp, SImode, 1);
-      emit_move_insn (mem, chain_value);
-
-      /* Compute offset from the end of the jmp to the target function.
-         In the case in which the trampoline stores the static chain on
-         the stack, we need to skip the first insn which pushes the
-         (call-saved) register static chain; this push is 1 byte.  */
-      disp = expand_binop (SImode, sub_optab, fnaddr,
-                           plus_constant (XEXP (m_tramp, 0),
-                                          MEM_P (chain) ? 9 : 10),
-                           NULL_RTX, 1, OPTAB_DIRECT);
-
-      mem = adjust_address (m_tramp, QImode, 5);
-      emit_move_insn (mem, gen_int_mode (0xe9, QImode));
-
-      mem = adjust_address (m_tramp, SImode, 6);
-      emit_move_insn (mem, disp);
-    }
-  else
+  if (TARGET_64BIT)
     {
-      int offset = 0, size;
+      int size;
 
       /* Load the function address to r11.  Try to load address using
          the shorter movl instead of movabs.  We may want to support
@@ -22757,20 +22717,22 @@ ix86_trampoline_init (rtx m_tramp, tree
           offset += 10;
         }
 
-      /* Load static chain using movabs to r10.  */
-      mem = adjust_address (m_tramp, HImode, offset);
-      /* Use the shorter movl instead of movabs for x32.  */
+      /* Load static chain using movabs to r10.  Use the
+         shorter movl instead of movabs for x32.  */
       if (TARGET_X32)
         {
+          opcode = 0xba41;
           size = 6;
-          emit_move_insn (mem, gen_int_mode (0xba41, HImode));
         }
       else
         {
+          opcode = 0xba49;
           size = 10;
-          emit_move_insn (mem, gen_int_mode (0xba49, HImode));
         }
 
+      mem = adjust_address (m_tramp, HImode, offset);
+      emit_move_insn (mem, gen_int_mode (opcode, HImode));
+
       mem = adjust_address (m_tramp, ptr_mode, offset + 2);
       emit_move_insn (mem, chain_value);
       offset += size;
@@ -22780,10 +22742,56 @@ ix86_trampoline_init (rtx m_tramp, tree
       mem = adjust_address (m_tramp, SImode, offset);
       emit_move_insn (mem, gen_int_mode (0x90e3ff49, SImode));
       offset += 4;
+    }
+  else
+    {
+      rtx disp, chain;
 
-      gcc_assert (offset <= TRAMPOLINE_SIZE);
+      /* Depending on the static chain location, either load a register
+         with a constant, or push the constant to the stack.  All of the
+         instructions are the same size.  */
+      chain = ix86_static_chain (fndecl, true);
+      if (REG_P (chain))
+        {
+          switch (REGNO (chain))
+            {
+            case AX_REG:
+              opcode = 0xb8; break;
+            case CX_REG:
+              opcode = 0xb9; break;
+            default:
+              gcc_unreachable ();
+            }
+        }
+      else
+        opcode = 0x68;
+
+      mem = adjust_address (m_tramp, QImode, offset);
+      emit_move_insn (mem, gen_int_mode (opcode, QImode));
+
+      mem = adjust_address (m_tramp, SImode, offset + 1);
+      emit_move_insn (mem, chain_value);
+      offset += 5;
+
+      mem = adjust_address (m_tramp, QImode, offset);
+      emit_move_insn (mem, gen_int_mode (0xe9, QImode));
+
+      mem = adjust_address (m_tramp, SImode, offset + 1);
+
+      /* Compute offset from the end of the jmp to the target function.
+         In the case in which the trampoline stores the static chain on
+         the stack, we need to skip the first insn which pushes the
+         (call-saved) register static chain; this push is 1 byte.
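The byte layout produced by the 32-bit arm of this patch can be sketched outside the compiler. What follows is a minimal illustration and not GCC code: `build_tramp32`, `put_le32`, `enum chain_kind` and `TRAMP32_SIZE` are hypothetical names, and the function writes plain bytes into a buffer instead of emitting RTL. It uses only the opcodes visible in the diff (0xb8/0xb9 to load the static chain into a register, 0x68 to push it, 0xe9 for the jump) and the displacement rule from the removed code (9 instead of 10 when the chain is pushed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three static chain locations. */
enum chain_kind { CHAIN_IN_EAX, CHAIN_IN_ECX, CHAIN_ON_STACK };

/* 1 opcode byte + 4 immediate bytes, then 1 jmp byte + 4 displacement bytes. */
#define TRAMP32_SIZE 10

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Fill BUF (at least TRAMP32_SIZE bytes) with a trampoline that will live
   at TRAMP_ADDR, passes CHAIN_VALUE, and jumps to FN_ADDR.  Returns the
   number of bytes written. */
static size_t build_tramp32(uint8_t *buf, uint32_t tramp_addr,
                            uint32_t fn_addr, uint32_t chain_value,
                            enum chain_kind kind)
{
    size_t offset = 0;
    uint8_t opcode;
    uint32_t skip;

    switch (kind) {
    case CHAIN_IN_EAX: opcode = 0xb8; break;  /* mov imm32 into eax */
    case CHAIN_IN_ECX: opcode = 0xb9; break;  /* mov imm32 into ecx */
    default:           opcode = 0x68; break;  /* push imm32 */
    }

    buf[offset] = opcode;
    put_le32(buf + offset + 1, chain_value);
    offset += 5;

    buf[offset] = 0xe9;                       /* jmp rel32 */
    /* rel32 is relative to the end of the jmp, i.e. tramp_addr + 10.
       When the chain is pushed, the target's own 1-byte push is skipped,
       so the jump lands on fn_addr + 1: subtract 9 instead of 10. */
    skip = (kind == CHAIN_ON_STACK) ? 9 : 10;
    put_le32(buf + offset + 1, fn_addr - (tramp_addr + skip));
    offset += 5;

    /* The same kind of check the patch makes possible for 32-bit targets. */
    assert(offset <= TRAMP32_SIZE);
    return offset;
}
```

Tracking a running `offset`, as the patch does, is what lets a single size assertion cover every path through the function.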
[Patch, Fortran] Allocate + CAF library
Hello,

This is my largest patch so far and the first that I'll commit myself. This patch improves support for the ALLOCATE statement when using the coarray library. Specifically, it adds support for the stat= and errmsg= attributes:

    ALLOCATE( x(n)[*] , stat=i , errmsg=str )

These attributes are now written by the CAF library. This patch also involves a good amount of code cleanup.

ChangeLog is attached.

As soon as I get the go-ahead, I'll commit this patch.

Cheers,
Daniel.
--
I'm not overweight, I'm undertall.

Index: gcc/fortran/trans-array.c
===================================================================
--- gcc/fortran/trans-array.c (revision 176148)
+++ gcc/fortran/trans-array.c (working copy)
@@ -4366,7 +4366,8 @@ gfc_array_init_size (tree descriptor, in
 /*GCC ARRAYS*/
 
 bool
-gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree pstat)
+gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
+                    tree errlen)
 {
   tree tmp;
   tree pointer;
@@ -4460,22 +4461,15 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   error = build_call_expr_loc (input_location,
                            gfor_fndecl_runtime_error, 1, msg);
 
-  if (pstat != NULL_TREE && !integer_zerop (pstat))
+  if (status != NULL_TREE)
     {
-      /* Set the status variable if it's present.  */
+      tree status_type = TREE_TYPE (status);
       stmtblock_t set_status_block;
-      tree status_type = pstat ? TREE_TYPE (TREE_TYPE (pstat)) : NULL_TREE;
 
       gfc_start_block (&set_status_block);
-      gfc_add_modify (&set_status_block,
-                      fold_build1_loc (input_location, INDIRECT_REF,
-                                       status_type, pstat),
-                      build_int_cst (status_type, LIBERROR_ALLOCATION));
-
-      tmp = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
-                             pstat, build_int_cst (TREE_TYPE (pstat), 0));
-      error = fold_build3_loc (input_location, COND_EXPR, void_type_node, tmp,
-                               error, gfc_finish_block (&set_status_block));
+      gfc_add_modify (&set_status_block, status,
+                      build_int_cst (status_type, LIBERROR_ALLOCATION));
+      error = gfc_finish_block (&set_status_block);
     }
 
   gfc_start_block (&elseblock);
@@ -4484,14 +4478,15 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   pointer = gfc_conv_descriptor_data_get (se->expr);
   STRIP_NOPS (pointer);
 
-  /* The allocate_array variants take the old pointer as first argument.  */
+  /* The allocatable variant takes the old pointer as first argument.  */
   if (allocatable)
-    tmp = gfc_allocate_allocatable_with_status (&elseblock,
-                                                pointer, size, pstat, expr);
+    tmp = gfc_allocate_allocatable (&elseblock, pointer, size,
+                                    status, errmsg, errlen, expr);
   else
-    tmp = gfc_allocate_with_status (&elseblock, size, pstat, false);
-  tmp = fold_build2_loc (input_location, MODIFY_EXPR, void_type_node, pointer,
-                         tmp);
+    tmp = gfc_allocate_using_malloc (&elseblock, size, status);
+
+  tmp = fold_build2_loc (input_location, MODIFY_EXPR, void_type_node,
+                         pointer, tmp);
 
   gfc_add_expr_to_block (&elseblock, tmp);
 
Index: gcc/fortran/trans-array.h
===================================================================
--- gcc/fortran/trans-array.h (revision 176148)
+++ gcc/fortran/trans-array.h (working copy)
@@ -24,7 +24,7 @@ tree gfc_array_deallocate (tree, tree, g
 
 /* Generate code to initialize an allocate an array.  Statements are added to
    se, which should contain an expression for the array descriptor.  */
-bool gfc_array_allocate (gfc_se *, gfc_expr *, tree);
+bool gfc_array_allocate (gfc_se *, gfc_expr *, tree, tree, tree);
 
 /* Allow the bounds of a loop to be set from a callee's array spec.  */
 void gfc_set_loop_bounds_from_array_spec (gfc_interface_mapping *,
Index: gcc/fortran/trans-openmp.c
===================================================================
--- gcc/fortran/trans-openmp.c (revision 176148)
+++ gcc/fortran/trans-openmp.c (working copy)
@@ -188,9 +188,9 @@ gfc_omp_clause_default_ctor (tree clause
   size = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type,
                           size, esize);
   size = gfc_evaluate_now (fold_convert (size_type_node, size), &cond_block);
-  ptr = gfc_allocate_allocatable_with_status (&cond_block,
-                                              build_int_cst (pvoid_type_node, 0),
-                                              size, NULL, NULL);
+  ptr = gfc_allocate_allocatable (&cond_block,
+                                  build_int_cst (pvoid_type_node, 0),
+                                  size, NULL_TREE, NULL_TREE, NULL_TREE, NULL);
   gfc_conv_descriptor_data_set (&cond_block, decl, ptr);
   then_b = gfc_finish_block (&cond_block);
 
@@ -241,9 +241,9 @@ gfc_omp_clause_copy_ctor (tree clause, t
   size = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type,
                           size, esize);
   size = gfc_evaluate_now (fold_convert (size_type_node, size), &block);
-  ptr = gfc_allocate_allocatable_with_status (&block,
-                                              build_int_cst (pvoid_type_node, 0),
-                                              size, NULL, NULL);
+  ptr = gfc_allocate_allocatable (&block,
+                                  build_int_cst (pvoid_type_node, 0),
[pph] Stream out chains backwards (issue4657092)
**This patch goes on top of patches in issues 4672055 and 4675069 (which have yet to be committed)**

Some things are built as soon as a tree is streamed in, and since the chains are backwards, flipping them after streaming them in is not sufficient as some things (e.g. unique numbers given to functions) have already been allocated.

The solution is to stream out the chain backwards to begin with. This fixes the assembly diffs in which the LFB# were different in the pph and non-pph assembly.

As noted by a FIXME comment, we probably want to do this for usings and using_directives as well, but I didn't for this patch as we don't handle those yet, and I'm not sure whether their chain is flipped or not.

Fixed tests x1functions and c1pr36533. The c1pr44948-1a test changed from an ICE in lto_streamer_cache_get to an ICE in lto_get_pickled_tree, but lto_get_pickled_tree calls lto_streamer_cache_get and I'm pretty sure this is the same bug, not a new one introduced by this patch.

Tested with bootstrap build and pph regression testing.

2011-07-11  Gabriel Charette  gch...@google.com

        * pph-streamer-in.c (pph_add_bindings_to_namespace): Don't reverse
        names and namespaces chains.  Reverse names and namespaces only for
        the binding levels of namespaces streamed in as is.
        * pph-streamer-out.c (pph_out_chained_tree): New.
        (pph_out_chain_filtered): Add REVERSE parameter.
        (pph_out_binding_level): Use REVERSE parameter of
        pph_out_chain_filtered.

        * g++.dg/pph/c1pr36533.cc: Expect no asm difference.
        * g++.dg/pph/c1pr44948-1a.cc: Adjust XFAIL pattern.
        * g++.dg/pph/x1functions.cc: Expect no asm difference.

diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 55f7e12..fde1b93 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1146,11 +1146,6 @@ pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns)
 {
   tree t, chain;
 
-  /* The chains are built backwards (ref: add_decl_to_level),
-     reverse them before putting them back in.  */
-  bl->names = nreverse (bl->names);
-  bl->namespaces = nreverse (bl->namespaces);
-
   for (t = bl->names; t; t = chain)
     {
       /* Pushing a decl into a scope clobbers its DECL_CHAIN.
@@ -1164,11 +1159,26 @@ pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns)
 
   for (t = bl->namespaces; t; t = chain)
     {
+      struct cp_binding_level* ns_lvl;
+
       /* Pushing a decl into a scope clobbers its DECL_CHAIN.
          Preserve it.  */
       chain = DECL_CHAIN (t);
       pushdecl_into_namespace (t, ns);
-      pph_add_bindings_to_namespace (NAMESPACE_LEVEL (t), t);
+
+      /* FIXME pph: verify whether this namespace exists already,
+         if it does we should merge it.  */
+      ns_lvl = NAMESPACE_LEVEL (t);
+      /* FIXME pph: the only benefit of making this call is the embedded call to
+         varpool_finalize_decl for the names contained in this namespace and
+         it's transitive closure of namespaces, the bindings themselves do NOT
+         need to be added to this namespace as they are already part of it.  */
+      pph_add_bindings_to_namespace (ns_lvl, t);
+      /* Adding the bindings to another namespace automatically reverses them, but
+         since these were already part of this namespace, they weren't: reverse
+         them in place now.  */
+      ns_lvl->names = nreverse (ns_lvl->names);
+      ns_lvl->namespaces = nreverse (ns_lvl->namespaces);
     }
 }
 
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index d1e757f..445fca5 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -584,21 +584,44 @@ pph_out_label_binding (pph_stream *stream, cp_label_binding *lb, bool ref_p)
 }
 
 
+/* Outputs chained tree T by nulling out it's chain first and restoring it
+   after the streaming is done. STREAM and REF_P are as in
+   pph_out_chain_filtered.  */
+
+static inline void
+pph_out_chained_tree (pph_stream *stream, tree t, bool ref_p)
+{
+  tree saved_chain;
+
+  saved_chain = TREE_CHAIN (t);
+  TREE_CHAIN (t) = NULL_TREE;
+
+  pph_out_tree_or_ref_1 (stream, t, ref_p, 2);
+
+  TREE_CHAIN (t) = saved_chain;
+}
+
+
 /* Output a chain of nodes to STREAM starting with FIRST.  Skip any
    nodes that do not match FILTER.  REF_P is true if nodes in the chain
-   should be emitted as references.  */
+   should be emitted as references.  Stream the chain in the reverse order
+   if REVERSE is true.*/
 
 static void
 pph_out_chain_filtered (pph_stream *stream, tree first, bool ref_p,
-                        enum chain_filter filter)
+                        enum chain_filter filter, bool reverse)
 {
   unsigned count;
+  int i;
   tree t;
+  tree *to_stream = NULL;
 
   /* Special case.  If the caller wants no filtering, it is much
      faster to just call pph_out_chain directly.  */
   if (filter == NONE)
     {
+      if (reverse)
+        nreverse (first);
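The idea of writing a chain out backwards, so that the reader sees nodes in their original creation order, can be sketched on a plain singly linked list. This is an illustration only and makes assumptions: `struct node`, `emit_chain_reversed`, `struct id_log` and `log_id` are hypothetical names, not pph functions. The diff above declares a `to_stream` array, which suggests a similar collect-then-walk-backwards approach, but the rest of that function is cut off here, so the details below are a guess at the general technique and not the patch's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical chained node; `chain` plays the role of TREE_CHAIN. */
struct node {
    int id;
    struct node *chain;
};

/* Small recorder used to observe the order in which nodes are emitted. */
struct id_log {
    int ids[16];
    size_t n;
};

static void log_id(const struct node *n, void *ctx)
{
    struct id_log *seen = ctx;
    assert(seen->n < sizeof seen->ids / sizeof seen->ids[0]);
    seen->ids[seen->n++] = n->id;
}

/* Call EMIT on every node of the chain starting at FIRST, last node
   first, without modifying any links.  Returns 0 on success and -1 if
   the temporary array cannot be allocated. */
static int emit_chain_reversed(const struct node *first,
                               void (*emit)(const struct node *, void *),
                               void *ctx)
{
    size_t count = 0, i = 0;
    const struct node *t;
    const struct node **to_stream;

    for (t = first; t; t = t->chain)
        count++;
    if (count == 0)
        return 0;

    to_stream = malloc(count * sizeof *to_stream);
    if (to_stream == NULL)
        return -1;

    /* Collect the nodes in chain order, then walk the array backwards. */
    for (t = first; t; t = t->chain)
        to_stream[i++] = t;
    while (i > 0)
        emit(to_stream[--i], ctx);

    free(to_stream);
    return 0;
}
```

Collecting into an array leaves the in-memory chain untouched, which matters when other code still walks the original list while it is being written out.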
[gomp-3_1-branch] Update openmp_version in gcc/fortran/intrinsic.texi
The attached patch updates the version number in gfortran's intrinsic documentation. I don't know whether it makes sense to keep the number, but if one does, it should be up to date.

Jakub, is the attached patch OK for the branch?

Tobias

2011-07-11  Tobias Burnus  bur...@net-b.de

        * intrinsic.c (OMP_LIB): Updated openmp_version's value to 201107.

Index: intrinsic.texi
===================================================================
--- intrinsic.texi (revision 176173)
+++ intrinsic.texi (working copy)
@@ -13100,7 +13100,7 @@
 @code{OMP_LIB} provides the scalar default-integer
 named constant @code{openmp_version} with a value of the form
 @var{yyyymm}, where @code{yyyy} is the year and @var{mm} the month
-of the OpenMP version; for OpenMP v3.0 the value is @code{200805}.
+of the OpenMP version; for OpenMP v3.1 the value is @code{201107}.
 
 And the following scalar integer named constants of the
 kind @code{omp_sched_kind}:
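For readers unfamiliar with the yyyymm encoding described in the documentation text above, here is a small sketch of how such a value splits into year and month. It is written in C purely for illustration, and `struct omp_release` and `decode_openmp_version` are hypothetical names that are not part of any OpenMP or gfortran interface.

```c
#include <assert.h>

/* Hypothetical holder for a decoded version date. */
struct omp_release {
    int year;
    int month;
};

/* Split a value of the form yyyymm into its year and month parts. */
static struct omp_release decode_openmp_version(long yyyymm)
{
    struct omp_release r;

    assert(yyyymm > 0);
    r.year = (int)(yyyymm / 100);
    r.month = (int)(yyyymm % 100);
    return r;
}
```

With the two values mentioned in the patch, 200805 decodes to May 2008 and 201107 to July 2011.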
Re: [gomp-3_1-branch] Update openmp_version in gcc/fortran/intrinsic.texi
On Mon, Jul 11, 2011 at 08:26:22PM +0200, Tobias Burnus wrote:
> The attached patch updates the version number in gfortran's intrinsic documentation. I don't know whether it makes sense to keep the number, but if one does, it should be up to date.
>
> Jakub, is the attached patch OK for the branch?

Yeah, thanks.

> 2011-07-11  Tobias Burnus  bur...@net-b.de
>
>         * intrinsic.c (OMP_LIB): Updated openmp_version's value to 201107.

        Jakub
[Committed, Backport 4.6, AVR]: PR target/46779
Backported fix for PR46779 to 4.6: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176055
[v3] Fix libstdc++/49559
Hi,

for details, see the audit trail. Compared to the last draft, I also fixed __rotate_adaptive (issue noticed by artificially reducing the size of the buffer).

Tested x86_64-linux, committed.

Thanks,
Paolo.

2011-07-11  Paolo Carlini  paolo.carl...@oracle.com

        PR libstdc++/49559
        * include/bits/stl_algo.h (__move_merge_backward): Remove.
        (__move_merge_adaptive, __move_merge_adaptive_backward): New.
        (__merge_adaptive): Use the latter two.
        (__rotate_adaptive): Avoid self move-assignment.
        * include/bits/stl_algobase.h (move_backward): Fix comment.
        * testsuite/25_algorithms/stable_sort/49559.cc: New.
        * testsuite/25_algorithms/inplace_merge/49559.cc: Likewise.
        * testsuite/25_algorithms/inplace_merge/moveable.cc: Extend.
        * testsuite/25_algorithms/inplace_merge/moveable2.cc: Likewise.
        * testsuite/util/testsuite_rvalref.h (rvalstruct::operator=
        (rvalstruct&&)): Check for self move-assignment.

Index: include/bits/stl_algobase.h
===================================================================
--- include/bits/stl_algobase.h (revision 176144)
+++ include/bits/stl_algobase.h (working copy)
@@ -641,7 +641,7 @@
    *  loop count will be known (and therefore a candidate for compiler
    *  optimizations such as unrolling).
    *
-   *  Result may not be in the range [first,last).  Use move instead.  Note
+   *  Result may not be in the range (first,last].  Use move instead.  Note
    *  that the start of the output range may overlap [first,last).
   */
   template<typename _BI1, typename _BI2>
Index: include/bits/stl_algo.h
===================================================================
--- include/bits/stl_algo.h (revision 176144)
+++ include/bits/stl_algo.h (working copy)
@@ -2716,20 +2716,76 @@
 
   // merge
 
-  /// This is a helper function for the merge routines.
+  /// This is a helper function for the __merge_adaptive routines.
+  template<typename _InputIterator1, typename _InputIterator2,
+           typename _OutputIterator>
+    void
+    __move_merge_adaptive(_InputIterator1 __first1, _InputIterator1 __last1,
+                          _InputIterator2 __first2, _InputIterator2 __last2,
+                          _OutputIterator __result)
+    {
+      while (__first1 != __last1 && __first2 != __last2)
+        {
+          if (*__first2 < *__first1)
+            {
+              *__result = _GLIBCXX_MOVE(*__first2);
+              ++__first2;
+            }
+          else
+            {
+              *__result = _GLIBCXX_MOVE(*__first1);
+              ++__first1;
+            }
+          ++__result;
+        }
+      if (__first1 != __last1)
+        _GLIBCXX_MOVE3(__first1, __last1, __result);
+    }
+
+  /// This is a helper function for the __merge_adaptive routines.
+  template<typename _InputIterator1, typename _InputIterator2,
+           typename _OutputIterator, typename _Compare>
+    void
+    __move_merge_adaptive(_InputIterator1 __first1, _InputIterator1 __last1,
+                          _InputIterator2 __first2, _InputIterator2 __last2,
+                          _OutputIterator __result, _Compare __comp)
+    {
+      while (__first1 != __last1 && __first2 != __last2)
+        {
+          if (__comp(*__first2, *__first1))
+            {
+              *__result = _GLIBCXX_MOVE(*__first2);
+              ++__first2;
+            }
+          else
+            {
+              *__result = _GLIBCXX_MOVE(*__first1);
+              ++__first1;
+            }
+          ++__result;
+        }
+      if (__first1 != __last1)
+        _GLIBCXX_MOVE3(__first1, __last1, __result);
+    }
+
+  /// This is a helper function for the __merge_adaptive routines.
   template<typename _BidirectionalIterator1, typename _BidirectionalIterator2,
            typename _BidirectionalIterator3>
-    _BidirectionalIterator3
-    __move_merge_backward(_BidirectionalIterator1 __first1,
-                          _BidirectionalIterator1 __last1,
-                          _BidirectionalIterator2 __first2,
-                          _BidirectionalIterator2 __last2,
-                          _BidirectionalIterator3 __result)
+    void
+    __move_merge_adaptive_backward(_BidirectionalIterator1 __first1,
+                                   _BidirectionalIterator1 __last1,
+                                   _BidirectionalIterator2 __first2,
+                                   _BidirectionalIterator2 __last2,
+                                   _BidirectionalIterator3 __result)
     {
       if (__first1 == __last1)
-        return _GLIBCXX_MOVE_BACKWARD3(__first2, __last2, __result);
-      if (__first2 == __last2)
-        return _GLIBCXX_MOVE_BACKWARD3(__first1, __last1, __result);
+        {
+          _GLIBCXX_MOVE_BACKWARD3(__first2, __last2, __result);
+          return;
+        }
+      else if (__first2 == __last2)
+        return;
+
       --__last1;
       --__last2;
       while (true)
@@ -2738,34 +2794,41 @@
         {
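The forward helper added above copies only the leftover of the first range once the main loop ends. A plain C analogue on `int` arrays shows why that is enough when the first half has been moved into a scratch buffer and the second half is merged in place. This is a sketch under assumptions: `merge_from_buffer` and `inplace_merge_with_buffer` are hypothetical names, copies stand in for C++ moves, and the self move-assignment concern of the PR does not arise for `int`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Merge sorted buf[0..nbuf) with sorted second[0..nsecond) into result.
   RESULT is expected to point nbuf elements before SECOND, so the write
   position never overtakes the unread part of SECOND. */
static void merge_from_buffer(const int *buf, size_t nbuf,
                              const int *second, size_t nsecond,
                              int *result)
{
    size_t i = 0, j = 0;

    while (i < nbuf && j < nsecond) {
        if (second[j] < buf[i])
            *result++ = second[j++];
        else
            *result++ = buf[i++];   /* ties go to the first range: stable */
    }
    /* Only leftovers from the buffer need copying.  Leftovers from the
       second range already sit in their final positions. */
    if (i < nbuf)
        memcpy(result, buf + i, (nbuf - i) * sizeof *result);
}

/* Merge the sorted halves a[0..mid) and a[mid..n) in place, using BUF
   (at least MID elements, not overlapping A) as scratch space. */
static void inplace_merge_with_buffer(int *a, size_t mid, size_t n, int *buf)
{
    assert(mid <= n);
    memcpy(buf, a, mid * sizeof *a);
    merge_from_buffer(buf, mid, a + mid, n - mid, a);
}
```

Skipping the copy of the second range's tail is the design point: in this layout the source and destination of that copy would be the same storage, which for move-only C++ types is exactly the kind of self-assignment the patch avoids.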
C++ PATCH for c++/44609 (printing an error for each step in infinite template recursion)
The PR complained about G++ getting into an infinite loop, but it isn't really infinite; the problem is that in the testcase a function template has an error and then depends on another instance of itself. I've fixed this for many cases by refusing to instantiate a declaration if there have been errors since beginning to instantiate the nearest enclosing declaration. This doesn't affect classes and constexpr variables/functions, because we can't just decide not to instantiate them without producing other errors.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit af8514f2c47162f32f56d0ee3f18e9040d756b1f
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jul 11 09:38:11 2011 -0400

        PR c++/44609
        * cp-tree.h (struct tinst_level): Add errors field.
        * pt.c (neglectable_inst_p, limit_bad_template_recurson): New.
        (push_tinst_level): Don't start another decl in that case.
        (reopen_tinst_level): Adjust errors field.
        * decl2.c (cp_write_global_declarations): Don't complain about
        undefined inline if its template was defined.
        * mangle.c (mangle_decl_string): Handle failure from push_tinst_level.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 357295c..cc08640 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4679,6 +4679,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
   /* The location where the template is instantiated.  */
   location_t locus;
 
+  /* errorcount+sorrycount when we pushed this level.  */
+  int errors;
+
   /* True if the location is in a system header.  */
   bool in_system_header_p;
 };
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 8cd51c2..d90d4b5 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3950,10 +3950,10 @@ cp_write_global_declarations (void)
                  #pragma interface, etc.) we decided not to emit the
                  definition here.  */
               && !DECL_INITIAL (decl)
-              /* An explicit instantiation can be used to specify
-                 that the body is in another unit. It will have
-                 already verified there was a definition.  */
-              && !DECL_EXPLICIT_INSTANTIATION (decl))
+              /* Don't complain if the template was defined.  */
+              && !(DECL_TEMPLATE_INSTANTIATION (decl)
+                   && DECL_INITIAL (DECL_TEMPLATE_RESULT
+                                    (template_for_substitution (decl)))))
             {
               warning (0, "inline function %q+D used but never defined", decl);
               /* Avoid a duplicate warning from check_global_declaration_1.  */
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 81b772f..4a83c9a 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3106,11 +3106,11 @@ mangle_decl_string (const tree decl)
   if (DECL_LANG_SPECIFIC (decl) && DECL_USE_TEMPLATE (decl))
     {
       struct tinst_level *tl = current_instantiation ();
-      if (!tl || tl->decl != decl)
+      if ((!tl || tl->decl != decl)
+          && push_tinst_level (decl))
         {
           template_p = true;
           saved_fn = current_function_decl;
-          push_tinst_level (decl);
           current_function_decl = NULL_TREE;
         }
     }
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2c64dd4..7c735ef 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -7499,6 +7499,36 @@ uses_template_parms_level (tree t, int level)
                                  /*include_nondeduced_p=*/true);
 }
 
+/* Returns TRUE iff INST is an instantiation we don't need to do in an
+   ill-formed translation unit, i.e. a variable or function that isn't
+   usable in a constant expression.  */
+
+static inline bool
+neglectable_inst_p (tree d)
+{
+  return (DECL_P (d)
+          && !(TREE_CODE (d) == FUNCTION_DECL ? DECL_DECLARED_CONSTEXPR_P (d)
+               : decl_maybe_constant_var_p (d)));
+}
+
+/* Returns TRUE iff we should refuse to instantiate DECL because it's
+   neglectable and instantiated from within an erroneous instantiation.  */
+
+static bool
+limit_bad_template_recurson (tree decl)
+{
+  struct tinst_level *lev = current_tinst_level;
+  int errs = errorcount + sorrycount;
+  if (lev == NULL || errs == 0 || !neglectable_inst_p (decl))
+    return false;
+
+  for (; lev; lev = lev->next)
+    if (neglectable_inst_p (lev->decl))
+      break;
+
+  return (lev && errs > lev->errors);
+}
+
 static int tinst_depth;
 extern int max_tinst_depth;
 #ifdef GATHER_STATISTICS
@@ -7532,9 +7562,16 @@ push_tinst_level (tree d)
       return 0;
     }
 
+  /* If the current instantiation caused problems, don't let it instantiate
+     anything else.  Do allow deduction substitution and decls usable in
+     constant expressions.  */
+  if (limit_bad_template_recurson (d))
+    return 0;
+
   new_level = ggc_alloc_tinst_level ();
   new_level->decl = d;
   new_level->locus = input_location;
+  new_level->errors = errorcount+sorrycount;
   new_level->in_system_header_p = in_system_header;
   new_level->next = current_tinst_level;
   current_tinst_level = new_level;
@@ -7578,6 +7615,8 @@ reopen_tinst_level (struct tinst_level *level)
 
   current_tinst_level = level;
   pop_tinst_level ();
+  if (current_tinst_level)
+    current_tinst_level->errors = errorcount+sorrycount;
   return level->decl;
 }
 
diff --git
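The gating rule in the pt.c hunks (snapshot the error count when a level is pushed, then refuse to push a further skippable declaration if errors have been reported since the nearest enclosing skippable level) can be modelled in a few lines. This is a simplified sketch and not the G++ implementation: `struct level`, `error_count`, `should_refuse`, `push_level` and `pop_level` are hypothetical stand-ins, and a boolean flag replaces the `neglectable_inst_p` test on real declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one entry on the instantiation stack. */
struct level {
    bool neglectable;     /* instantiation could be skipped after errors */
    int errors;           /* error count when this level was pushed */
    struct level *next;   /* enclosing level */
};

static int error_count;          /* stands in for errorcount + sorrycount */
static struct level *current;    /* innermost level, or NULL */

/* True if a new level should not be started: it is skippable, and errors
   have been reported since the nearest enclosing skippable level began. */
static bool should_refuse(bool neglectable)
{
    const struct level *lev = current;

    if (lev == NULL || error_count == 0 || !neglectable)
        return false;
    while (lev != NULL && !lev->neglectable)
        lev = lev->next;
    return lev != NULL && error_count > lev->errors;
}

/* Push LV unless the gate refuses; returns whether it was pushed. */
static bool push_level(struct level *lv, bool neglectable)
{
    if (should_refuse(neglectable))
        return false;
    lv->neglectable = neglectable;
    lv->errors = error_count;
    lv->next = current;
    current = lv;
    return true;
}

static void pop_level(void)
{
    assert(current != NULL);
    current = current->next;
}
```

Because the push can now fail, callers have to check its result, which is what the mangle.c hunk does by moving the call into the `if` condition.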
[Ada] Fix --enable-build-with-cxx build
This is an updated version of Laurent's patch originally here: http://gcc.gnu.org/ml/gcc/2009-06/msg00635.html

Bootstrapped/regtested on x86_64-suse-linux with --enable-build-with-cxx.

Arno, I think we should apply it. This isn't very intrusive in the end and with it we can go full C++ instead of requiring 3 compilers to bootstrap.

2011-07-11  Laurent GUERBY  laur...@guerby.net
            Eric Botcazou  ebotca...@adacore.com

gnattools/
        * Makefile.in (TOOLS_FLAGS_TO_PASS_1): Add LINKER.
        (TOOLS_FLAGS_TO_PASS_1re): Likewise.
        (TOOLS_FLAGS_TO_PASS_NATIVE): Likewise.
        (TOOLS_FLAGS_TO_PASS_CROSS): Likewise.
gcc/
        * prefix.h: Wrap up in extern "C" block.
ada/
        * adadecode.c: Likewise.
        * adadecode.h: Likewise.
        * adaint.c: Likewise.
        * adaint.h: Likewise.
        * argv.c: Likewise.
        * arit64.c: Likewise.
        * atree.h: Likewise.
        * aux-io.c: Likewise.
        * cal.c: Likewise.
        * cio.c: Likewise.
        * cstreams.c: Likewise.
        * ctrl_c.c: Likewise.
        * env.c: Likewise.
        * errno.c: Likewise.
        * exit.c: Likewise.
        * expect.c: Likewise.
        * fe.h: Likewise.
        * final.c: Likewise.
        * init.c: Likewise.
        * initialize.c: Likewise.
        * link.c: Likewise.
        * mkdir.c: Likewise.
        * namet.h: Likewise.
        * nlists.h: Likewise.
        * raise-gcc.c: Likewise.
        * raise.c: Likewise.
        * raise.h: Likewise.
        * repinfo.h: Likewise.
        * s-oscons-tmplt.c: Likewise.
        * seh_init.c: Likewise.
        * socket.c: Likewise.
        * sysdep.c: Likewise.
        * targext.c: Likewise.
        * tb-alvms.c: Likewise.
        * tb-alvxw.c: Likewise.
        * tb-gcc.c: Likewise.
        * tb-ivms.c: Likewise.
        * tracebak.c: Likewise.
        * uintp.h: Likewise.
        * urealp.h: Likewise.
        * vx_stack_info.c: Likewise.
        * xeinfo.adb: Wrap up generated C code in extern "C" block.
        * xsinfo.adb: Wrap up generated C code in extern "C" block.
        * xsnamest.adb: Wrap up generated C code in extern "C" block.
        * gcc-interface/gadaint.h: Wrap up in extern "C" block.
        * gcc-interface/gigi.h: Wrap up prototypes in extern "C" block.
        * gcc-interface/misc.c: Wrap up prototypes in extern "C" block.
        * gcc-interface/Make-lang.in (GCC_LINK): Use LINKER.
        * gcc-interface/Makefile.in (GCC_LINK): Likewise.

--
Eric Botcazou

Index: gnattools/Makefile.in
===================================================================
--- gnattools/Makefile.in (revision 176072)
+++ gnattools/Makefile.in (working copy)
@@ -67,6 +67,7 @@ ADA_INCLUDES_FOR_SUBDIR = -I. -I$(fsrcdi
 # Variables for gnattools1, native
 TOOLS_FLAGS_TO_PASS_1= \
         "CC=../../xgcc -B../../" \
+        "LINKER=$(CXX)" \
         "CFLAGS=$(CFLAGS) $(WARN_CFLAGS)" \
         "LDFLAGS=$(LDFLAGS)" \
         "ADAFLAGS=$(ADAFLAGS)" \
@@ -82,6 +83,7 @@ TOOLS_FLAGS_TO_PASS_1= \
 # Variables for regnattools
 TOOLS_FLAGS_TO_PASS_1re= \
         "CC=../../xgcc -B../../" \
+        "LINKER=$(CXX)" \
         "CFLAGS=$(CFLAGS)" \
         "ADAFLAGS=$(ADAFLAGS)" \
         "ADA_CFLAGS=$(ADA_CFLAGS)" \
@@ -99,6 +101,7 @@ TOOLS_FLAGS_TO_PASS_1re= \
 # Variables for gnattools2, native
 TOOLS_FLAGS_TO_PASS_NATIVE= \
         "CC=../../xgcc -B../../" \
+        "LINKER=$(CXX)" \
         "CFLAGS=$(CFLAGS)" \
         "ADAFLAGS=$(ADAFLAGS)" \
         "ADA_CFLAGS=$(ADA_CFLAGS)" \
@@ -115,6 +118,7 @@ TOOLS_FLAGS_TO_PASS_NATIVE= \
 # Variables for gnattools, cross
 TOOLS_FLAGS_TO_PASS_CROSS= \
         "CC=$(CC)" \
+        "LINKER=$(CXX)" \
         "CFLAGS=$(CFLAGS) $(WARN_CFLAGS)" \
         "LDFLAGS=$(LDFLAGS)" \
         "ADAFLAGS=$(ADAFLAGS)" \
Index: gcc/ada/adadecode.h
===================================================================
--- gcc/ada/adadecode.h (revision 176072)
+++ gcc/ada/adadecode.h (working copy)
@@ -29,6 +29,10 @@
  *                                                                          *
  ****************************************************************************/
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* This function will return the Ada name from the encoded form.
    The Ada coding is done in exp_dbug.ads and this is the inverse function.
    see exp_dbug.ads for full encoding rules, a short description is added
@@ -51,3 +55,7 @@ extern void get_encoding (const char *,
    function used in the binutils and GDB. Always consider using __gnat_decode
    instead of ada_demangle. Caller must free the pointer returned.  */
 extern char *ada_demangle (const char *);
+
+#ifdef __cplusplus
+}
+#endif
Index: gcc/ada/sysdep.c
===================================================================
--- gcc/ada/sysdep.c (revision 176072)
+++ gcc/ada/sysdep.c (working copy)
@@ -30,7 +30,11 @@
  ****************************************************************************/
 
 /* This file contains system dependent symbols that are referenced in the
-   GNAT Run Time Library */
+   GNAT Run Time Library.  */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
 
 #ifdef __vxworks
 #include "ioLib.h"
@@ -1012,3
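The guard that this patch wraps around the Ada runtime's C files follows a common pattern: the `extern "C"` block is visible only to a C++ compiler, so the same source compiles as C and, when built as C++, keeps C linkage (unmangled symbol names) for the declarations inside. A minimal self-contained sketch follows; `demo_name_length` is a hypothetical function and not part of GNAT.

```c
#include <assert.h>
#include <string.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical prototype: has C linkage whether this file is compiled
   as C or as C++. */
int demo_name_length(const char *name);

#ifdef __cplusplus
}
#endif

/* The definition matches the prototype above, so under a C++ compiler
   it inherits the C linkage declared there. */
int demo_name_length(const char *name)
{
    assert(name != NULL);
    return (int) strlen(name);
}
```

This matters for a runtime like GNAT's because Ada code imports these routines by their plain C symbol names, which C++ name mangling would otherwise change.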
[dwarf2cfi] Cleanup interpretation of cfa.reg
Sometimes we compare cfa.reg with REGNO, and sometimes with something that has been passed through DWARF_FRAME_REGNUM. This leads to all sorts of confusion.

I think that ideally we'd leave dw_cfa_location.reg in the GCC regno space, because that's convenient for the majority of the code that interprets rtl and turns it into CFIs. However, we have no inverse of DWARF_FRAME_REGNUM, which means that lookup_cfa_1 cannot read CFI data in dwarf2 regno space and produce an output in GCC regno space.

Therefore, I've audited all uses of dw_cfa_location.reg to ensure that all references are in dwarf2 regno space.

It would have been nice to be able to use C++ classes to be able to do this checking in perpetuity, but the equivalent struct wrapping in C would have made the source too ugly.

Tested on x86_64-linux. Committed.

r~

        * dwarf2cfi.c (DW_STACK_POINTER_REGNUM): New.
        (DW_FRAME_POINTER_REGNUM): New.
        (expand_builtin_init_dwarf_reg_sizes): Use unsigned for rnum.
        (def_cfa_1): Do not convert reg to DWARF_FRAME_REGNUM here.
        (dwf_regno): New.
        (dwarf2out_flush_queued_reg_saves, dwarf2out_frame_debug_def_cfa,
        dwarf2out_frame_debug_adjust_cfa, dwarf2out_frame_debug_cfa_register,
        dwarf2out_frame_debug_cfa_expression, dwarf2out_frame_debug_expr):
        Use it.
        * dwarf2out.c (based_loc_descr): Use dwarf_frame_regnum.
        * dwarf2out.h (dwarf_frame_regnum): New.
        (struct cfa_loc): Document the domain of the reg member.

diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 5b8420e..1c76b3f 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -57,6 +57,10 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Maximum size (in bytes) of an artificially generated label.  */
 #define MAX_ARTIFICIAL_LABEL_BYTES 30
+
+/* Short-hand for commonly used register numbers.  */
+#define DW_STACK_POINTER_REGNUM  dwarf_frame_regnum (STACK_POINTER_REGNUM)
+#define DW_FRAME_POINTER_REGNUM  dwarf_frame_regnum (HARD_FRAME_POINTER_REGNUM)
 
 /* A vector of call frame insns for the CIE.  */
 cfi_vec cie_cfi_vec;
@@ -85,7 +89,7 @@ static void dwarf2out_frame_debug_restore_state (void);
 rtx
 expand_builtin_dwarf_sp_column (void)
 {
-  unsigned int dwarf_regnum = DWARF_FRAME_REGNUM (STACK_POINTER_REGNUM);
+  unsigned int dwarf_regnum = DW_STACK_POINTER_REGNUM;
   return GEN_INT (DWARF2_FRAME_REG_OUT (dwarf_regnum, 1));
 }
 
@@ -113,7 +117,7 @@ expand_builtin_init_dwarf_reg_sizes (tree address)
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
-      int rnum = DWARF2_FRAME_REG_OUT (DWARF_FRAME_REGNUM (i), 1);
+      unsigned int rnum = DWARF2_FRAME_REG_OUT (dwarf_frame_regnum (i), 1);
 
       if (rnum < DWARF_FRAME_REGISTERS)
         {
@@ -123,7 +127,7 @@ expand_builtin_init_dwarf_reg_sizes (tree address)
 
           if (HARD_REGNO_CALL_PART_CLOBBERED (i, save_mode))
             save_mode = choose_hard_reg_mode (i, 1, true);
-          if (DWARF_FRAME_REGNUM (i) == DWARF_FRAME_RETURN_COLUMN)
+          if (dwarf_frame_regnum (i) == DWARF_FRAME_RETURN_COLUMN)
             {
               if (save_mode == VOIDmode)
                 continue;
@@ -415,8 +419,6 @@ def_cfa_1 (dw_cfa_location *loc_p)
   if (cfa_store.reg == loc.reg && loc.indirect == 0)
     cfa_store.offset = loc.offset;
 
-  loc.reg = DWARF_FRAME_REGNUM (loc.reg);
-
   /* If nothing changed, no need to issue any call frame instructions.  */
   if (cfa_equal_p (&loc, &old_cfa))
     return;
@@ -810,10 +812,10 @@ dwarf2out_args_size (HOST_WIDE_INT size)
 static void
 dwarf2out_stack_adjust (HOST_WIDE_INT offset)
 {
-  if (cfa.reg == STACK_POINTER_REGNUM)
+  if (cfa.reg == DW_STACK_POINTER_REGNUM)
     cfa.offset += offset;
 
-  if (cfa_store.reg == STACK_POINTER_REGNUM)
+  if (cfa_store.reg == DW_STACK_POINTER_REGNUM)
     cfa_store.offset += offset;
 
   if (ACCUMULATE_OUTGOING_ARGS)
@@ -859,7 +861,7 @@ dwarf2out_notice_stack_adjust (rtx insn, bool after_p)
 
   /* If only calls can throw, and we have a frame pointer,
      save up adjustments until we see the CALL_INSN.  */
-  if (!flag_asynchronous_unwind_tables && cfa.reg != STACK_POINTER_REGNUM)
+  if (!flag_asynchronous_unwind_tables && cfa.reg != DW_STACK_POINTER_REGNUM)
     {
       if (CALL_P (insn) && !after_p)
         {
@@ -952,6 +954,16 @@ static GTY(()) VEC(reg_saved_in_data, gc) *regs_saved_in_regs;
 
 static GTY(()) reg_saved_in_data *cie_return_save;
 
+/* Short-hand inline for the very common D_F_R (REGNO (x)) operation.  */
+/* ??? This ought to go into dwarf2out.h alongside dwarf_frame_regnum,
+   except that dwarf2out.h is used in places where rtl is prohibited.  */
+
+static inline unsigned
+dwf_regno (const_rtx reg)
+{
+  return dwarf_frame_regnum (REGNO (reg));
+}
+
 /* Compare X and Y for equivalence.  The inputs may be REGs or PC_RTX.  */
 
 static bool
@@ -1031,9 +1043,9 @@ dwarf2out_flush_queued_reg_saves (void)
       if (q->reg == pc_rtx)
         reg = DWARF_FRAME_RETURN_COLUMN;
       else
-        reg =
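The confusion this patch removes comes from comparing register numbers drawn from two different numbering spaces. A toy model makes the failure mode concrete. Everything here is an assumption made for illustration: `enum hard_reg`, the `dwarf_regno_map` table with its made-up values, `to_dwarf_regno`, `DW_SP` and `stack_adjust` are hypothetical and do not reproduce any real target's mapping or GCC's data structures.

```c
#include <assert.h>

/* Hypothetical compiler-internal register numbers. */
enum hard_reg { HR_AX, HR_DX, HR_CX, HR_BX, HR_SI, HR_DI, HR_BP, HR_SP,
                HR_COUNT };

/* Made-up one-way mapping from internal numbers to debug-info numbers.
   As in the message above, no inverse mapping is assumed to exist. */
static const unsigned dwarf_regno_map[HR_COUNT] = { 0, 2, 1, 3, 6, 7, 5, 4 };

static unsigned to_dwarf_regno(enum hard_reg r)
{
    assert((unsigned) r < HR_COUNT);
    return dwarf_regno_map[r];
}

/* Short-hand in the spirit of DW_STACK_POINTER_REGNUM. */
#define DW_SP to_dwarf_regno(HR_SP)

/* The reg member is documented to hold a debug-info number, never an
   internal one. */
struct cfa_loc {
    unsigned reg;
    long offset;
};

/* Adjust the offset only when the CFA is based on the stack pointer.
   Both sides of the comparison are in the same numbering space. */
static void stack_adjust(struct cfa_loc *cfa, long delta)
{
    if (cfa->reg == DW_SP)
        cfa->offset += delta;
}
```

With this table the stack pointer is 7 internally but 4 in the debug-info space, so a comparison against the raw internal number would silently never match; converting once and documenting the field's domain avoids that.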
libgo patch committed: Use abort, not std::abort, in C code
This patch changes std::abort() to abort() in C code. I'm not sure how this was working previously. Bootstrapped on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 3291a9609c87 libgo/runtime/go-unwind.c
--- a/libgo/runtime/go-unwind.c Thu Jul 07 09:48:11 2011 -0700
+++ b/libgo/runtime/go-unwind.c Mon Jul 11 09:59:04 2011 -0700
@@ -293,7 +293,7 @@
       break;
 
     default:
-      std::abort();
+      abort();
     }
 
   actions |= state & _US_FORCE_UNWIND;
[gomp-3.1] const qualified vs. predetermination
Hi! The final standard dropped const qualified vars without mutable member back to being predetermined shared, but allows them to be specified in firstprivate clause (so that valid OpenMP 3.0 using default(none) aren't suddenly invalid). The following patch implements that. I'm not 100% sure about const qualified static data members or const qualified threadprivate, have asked about it on openmp forums, for the time being they aren't allowed in firstprivate clause. 2011-07-11 Jakub Jelinek ja...@redhat.com gcc/ * c-typeck.c (c_finish_omp_clauses): Don't complain about const qualified predetermined vars in firstprivate clause. Revert 2011-03-10 Jakub Jelinek ja...@redhat.com * c-typeck.c (c_finish_omp_clauses): Complain about TREE_READONLY decls in private, lastprivate and reduction clauses. gcc/c-family/ Revert 2011-03-10 Jakub Jelinek ja...@redhat.com * c-omp.c (c_omp_predetermined_sharing): Don't return OMP_CLAUSE_DEFAULT_SHARED for TREE_READONLY decls. gcc/cp/ * semantics.c (finish_omp_clauses): Don't complain about const qualified predetermined vars in firstprivate clause, unless it is a static data member. Revert 2011-03-10 Jakub Jelinek ja...@redhat.com * cp-gimplify.c (cxx_omp_predetermined_sharing): Don't return OMP_CLAUSE_DEFAULT_SHARED for decls with TYPE_READONLY type having no mutable member. * semantics.c (finish_omp_clauses): Complain about TREE_READONLY decls with no mutable member in private, lastprivate and reduction clauses. gcc/testsuite/ * g++.dg/gomp/private-1.C: Adjust for expected wording of error messages. Revert 2011-03-10 Jakub Jelinek ja...@redhat.com * gcc.dg/gomp/appendix-a/a.24.1.c: Adjust for const-qualified decls having no mutable members no longer being predetermined shared. * gcc.dg/gomp/sharing-1.c: Likewise. * gcc.dg/gomp/clause-1.c: Likewise. * g++.dg/gomp/sharing-1.C: Likewise. * g++.dg/gomp/clause-3.C: Likewise. * g++.dg/gomp/predetermined-1.C: Likewise. 
--- gcc/c-family/c-omp.c	(revision 176179)
+++ gcc/c-family/c-omp.c	(working copy)
@@ -601,7 +601,12 @@ c_split_parallel_clauses (location_t loc
 /* True if OpenMP sharing attribute of DECL is predetermined.  */
 
 enum omp_clause_default_kind
-c_omp_predetermined_sharing (tree decl ATTRIBUTE_UNUSED)
+c_omp_predetermined_sharing (tree decl)
 {
+  /* Variables with const-qualified type having no mutable member
+     are predetermined shared.  */
+  if (TREE_READONLY (decl))
+    return OMP_CLAUSE_DEFAULT_SHARED;
+
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 }
--- gcc/cp/cp-gimplify.c	(revision 176179)
+++ gcc/cp/cp-gimplify.c	(working copy)
@@ -1372,6 +1372,8 @@ cxx_omp_privatize_by_reference (const_tr
 enum omp_clause_default_kind
 cxx_omp_predetermined_sharing (tree decl)
 {
+  tree type;
+
   /* Static data members are predetermined as shared.  */
   if (TREE_STATIC (decl))
     {
@@ -1380,6 +1382,41 @@ cxx_omp_predetermined_sharing (tree decl
         return OMP_CLAUSE_DEFAULT_SHARED;
     }
 
+  type = TREE_TYPE (decl);
+  if (TREE_CODE (type) == REFERENCE_TYPE)
+    {
+      if (!is_invisiref_parm (decl))
+        return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
+      type = TREE_TYPE (type);
+
+      if (TREE_CODE (decl) == RESULT_DECL && DECL_NAME (decl))
+        {
+          /* NVR doesn't preserve const qualification of the
+             variable's type.  */
+          tree outer = outer_curly_brace_block (current_function_decl);
+          tree var;
+
+          if (outer)
+            for (var = BLOCK_VARS (outer); var; var = DECL_CHAIN (var))
+              if (DECL_NAME (decl) == DECL_NAME (var)
+                  && (TYPE_MAIN_VARIANT (type)
+                      == TYPE_MAIN_VARIANT (TREE_TYPE (var))))
+                {
+                  if (TYPE_READONLY (TREE_TYPE (var)))
+                    type = TREE_TYPE (var);
+                  break;
+                }
+        }
+    }
+
+  if (type == error_mark_node)
+    return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
+
+  /* Variables with const-qualified type having no mutable member
+     are predetermined shared.  */
+  if (TYPE_READONLY (type) && !cp_has_mutable_p (type))
+    return OMP_CLAUSE_DEFAULT_SHARED;
+
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 }
--- gcc/cp/semantics.c	(revision 176179)
+++ gcc/cp/semantics.c	(working copy)
@@ -3966,7 +3966,6 @@ finish_omp_clauses (tree clauses)
       bool need_copy_ctor = false;
       bool need_copy_assignment = false;
       bool need_implicitly_determined = false;
-      bool no_const = false;
       tree type, inner_type;
 
       switch (c_kind)
@@ -3980,7 +3979,6 @@ finish_omp_clauses (tree clauses)
           need_complete_non_reference = true;
           need_default_ctor = true;
           need_implicitly_determined =
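For readers who want to see what the rule described above means for user code, here is a minimal sketch in C. The function name scaled_sum and the variable names are made up for illustration and do not come from the patch or its testsuite. The const-qualified local "scale" is listed in a firstprivate clause under default(none), which is the case the message says must stay valid. Whether a particular GCC release accepts this depends on whether it contains this change; without -fopenmp the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: multiplies every element
   by a const-qualified factor and sums the results.  */
static long scaled_sum(const int *data, size_t n)
{
    const int scale = 3;   /* const-qualified, no mutable member */
    long total = 0;
    size_t i;

    assert(data != NULL || n == 0);

    /* default(none) requires every variable to have an explicit or
       predetermined data-sharing attribute.  "scale" is named in
       firstprivate, the clause the patch keeps legal for const vars.  */
#pragma omp parallel for default(none) firstprivate(scale) shared(data, n) reduction(+:total)
    for (i = 0; i < n; i++)
        total += (long)data[i] * scale;

    return total;
}
```

Listing the const variable explicitly, as done here, avoids relying on its predetermined sharing at all, which makes the code independent of which revision of the rule a compiler follows.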
Re: [PLUGIN] c-family files installation
On 07/11/2011 05:18 PM, Romain Geissler wrote:
> This patch adds a new exception to the plugin header flattening strategy. c-family files can't be installed in the plugin include root directory, as some other files like cp/cp-tree.h will look for them in the c-family directory.
>
> Furthermore, I had to correct an include in c-pretty-print.h so that it looks for c-common.h in the c-family directory. That way, headers will work out of the box when compiling a plugin; there is no need for an additional include directory.
>
> Builds and installs fine. Ok for the trunk (I have no write access)?

Looks ok (but I cannot approve it). Almost the same patch was submitted at http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01461.html, but this chunk was left unreviewed.

Matthias
libgo patch committed: Define CC_FOR_BUILD in Makefile
This patch to libgo defines CC_FOR_BUILD in Makefile, to make it more likely to be able to build code in the libgo subdirectory. Bootstrapped on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 4732400182b5 libgo/configure.ac
--- a/libgo/configure.ac	Mon Jul 11 13:13:33 2011 -0700
+++ b/libgo/configure.ac	Mon Jul 11 13:25:16 2011 -0700
@@ -42,6 +42,9 @@
 AC_SUBST(enable_shared)
 AC_SUBST(enable_static)
 
+CC_FOR_BUILD=${CC_FOR_BUILD:-gcc}
+AC_SUBST(CC_FOR_BUILD)
+
 WARN_FLAGS='-Wall -Wextra -Wwrite-strings -Wcast-qual'
 AC_SUBST(WARN_FLAGS)
RFC: attribute to reverse bitfield allocations
Finally getting around to writing this one. The idea is to have an attribute which determines how bitfields are allocated within words (lsb-first vs msb-first), assuming the programmer doesn't ask us to do something impossible.

  __attribute__((bitorder(FOO)))

where FOO is:

  native (or omitted, or no attribute): no swapping
  lsb, msb: swap as needed to get the desired allocation order
  swapped: always swap

First pass. Still missing: documentation, checks for overlapped bitfields after swapping. Is this approach acceptable?

Note: the qsort is because the output function requires fields to be in bit-index order, but you can't sort them earlier or the constructors wouldn't match the fields.

Index: c-family/c-common.c
===================================================================
--- c-family/c-common.c	(revision 176083)
+++ c-family/c-common.c	(working copy)
@@ -312,12 +312,13 @@ struct visibility_flags visibility_optio
 
 static tree c_fully_fold_internal (tree expr, bool, bool *, bool *);
 static tree check_case_value (tree);
 static bool check_case_bounds (tree, tree, tree *, tree *);
 
 static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
+static tree handle_bitorder_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nocommon_attribute (tree *, tree, tree, int, bool *);
 static tree handle_common_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_hot_attribute (tree *, tree, tree, int, bool *);
 static tree handle_cold_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
@@ -589,12 +590,14 @@ const unsigned int num_c_common_reswords
 const struct attribute_spec c_common_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
        affects_type_identity } */
   { "packed",                 0, 0, false, false, false,
                               handle_packed_attribute , false},
+  { "bitorder",               0, 1, false, true, false,
+                              handle_bitorder_attribute , false},
   { "nocommon",               0, 0, true,  false, false,
                               handle_nocommon_attribute, false},
   { "common",                 0, 0, true,  false, false,
                               handle_common_attribute, false },
   /* FIXME: logically, noreturn attributes should be listed as
      "false, true, true" and apply to function types.  But implementing this
@@ -5764,12 +5767,42 @@ handle_packed_attribute (tree *node, tre
       *no_add_attrs = true;
     }
 
   return NULL_TREE;
 }
 
+/* Handle a "bitorder" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bitorder_attribute (tree *ARG_UNUSED (node), tree ARG_UNUSED (name),
+                           tree ARG_UNUSED (args),
+                           int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  tree bmode;
+  const char *bname;
+
+  /* Allow no arguments to mean "native".  */
+  if (args == NULL_TREE)
+    return NULL_TREE;
+
+  bmode = TREE_VALUE (args);
+
+  bname = IDENTIFIER_POINTER (bmode);
+  if (strcmp (bname, "msb")
+      && strcmp (bname, "lsb")
+      && strcmp (bname, "swapped")
+      && strcmp (bname, "native"))
+    {
+      error ("%qE is not a valid bitorder - use lsb, msb, native, or swapped", bmode);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a "nocommon" attribute; arguments as in
    struct attribute_spec.handler.  */
 
 static tree
 handle_nocommon_attribute (tree *node, tree name,
                            tree ARG_UNUSED (args),
Index: stor-layout.c
===================================================================
--- stor-layout.c	(revision 176083)
+++ stor-layout.c	(working copy)
@@ -1716,24 +1716,82 @@ finalize_type_size (tree type)
           TYPE_ALIGN (variant) = align;
           TYPE_USER_ALIGN (variant) = user_align;
           SET_TYPE_MODE (variant, mode);
         }
     }
 }
+
+static void
+reverse_bitfield_layout (record_layout_info rli)
+{
+  tree field, oldtype;
+  for (field = TYPE_FIELDS (rli->t); field; field = TREE_CHAIN (field))
+    {
+      tree type = TREE_TYPE (field);
+      if (TREE_CODE (field) != FIELD_DECL)
+        continue;
+      if (TREE_CODE (field) == ERROR_MARK || TREE_CODE (type) == ERROR_MARK)
+        return;
+      oldtype = TREE_TYPE (DECL_FIELD_BIT_OFFSET (field));
+      DECL_FIELD_BIT_OFFSET (field)
+        = size_binop (MINUS_EXPR,
+                      size_binop (MINUS_EXPR, TYPE_SIZE (type),
+                                  DECL_SIZE (field)),
+                      DECL_FIELD_BIT_OFFSET (field));
+      TREE_TYPE (DECL_FIELD_BIT_OFFSET (field)) = oldtype;
+    }
+}
+
+static int
+reverse_bitfields_p (record_layout_info rli)
+{
+  tree st, arg;
+  const char *mode;
+
+  st = rli->t;
+
+  arg = lookup_attribute ("bitorder", TYPE_ATTRIBUTES (st));
+
+  if (!arg)
+    return
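The offset arithmetic that reverse_bitfield_layout performs (new offset = container size - field size - old offset) can be tried outside the compiler. The following is only a sketch on plain integers; reverse_bit_offset and its parameter names are hypothetical and are not part of the patch, which works on tree nodes via size_binop.

```c
#include <assert.h>

/* Hypothetical stand-alone version of the arithmetic in the patch:
   mirror a bitfield of WIDTH bits at bit OFFSET inside a container of
   CONTAINER_BITS bits, so that an lsb-first position becomes the
   corresponding msb-first position and vice versa.  */
static unsigned reverse_bit_offset(unsigned container_bits,
                                   unsigned offset,
                                   unsigned width)
{
    /* The field must fit entirely inside the container.  */
    assert(width <= container_bits && offset <= container_bits - width);
    return container_bits - width - offset;
}
```

For a 32-bit container, a 3-bit field at offset 0 moves to offset 29, and a following 5-bit field at offset 3 moves to offset 24. Applying the function twice gives back the original offset, which matches the idea that "swapped" is its own inverse.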
Re: Use of vector instructions in memmov/memset expanding
Resending in plain text:

On 11 July 2011 23:50, Michael Zolotukhin michael.v.zolotuk...@gmail.com wrote:

The attached patch enables use of vector instructions in memmov/memset expanding.

A new algorithm for move-mode selection is implemented for move_by_pieces and store_by_pieces. The x86-specific ix86_expand_movmem and ix86_expand_setmem are also changed in a similar way, and the x86 cost-model parameters are slightly changed to support this. This implementation checks whether the array's alignment is known at compile time and chooses the expansion algorithm and move mode accordingly.

Bootstrapped, with two new failures due to incorrect tests (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49503). The new implementation gives quite a big performance gain on memset/memcpy in some cases.

A bunch of new tests are added to verify the implementation.

Is it ok for trunk?

Changelog:

2011-07-11  Zolotukhin Michael  michael.v.zolotuk...@intel.com

	* config/i386/i386.h (processor_costs): Add second dimension to
	stringop_algs array.
	(clear_ratio): Tune value to improve performance.
	* config/i386/i386.c (cost models): Initialize second dimension of
	stringop_algs arrays.  Tune cost model in atom_cost, generic32_cost
	and generic64_cost.
	(ix86_expand_move): Add support for vector moves, that use half of
	vector register.
	(expand_set_or_movmem_via_loop_with_iter): New function.
	(expand_set_or_movmem_via_loop): Enable reuse of the same iters in
	different loops, produced by this function.
	(emit_strset): New function.
	(promote_duplicated_reg): Add support for vector modes, add
	declaration.
	(promote_duplicated_reg_to_size): Likewise.
	(expand_movmem_epilogue): Add epilogue generation for bigger sizes.
	(expand_setmem_epilogue): Likewise.
	(expand_movmem_prologue): Likewise for prologue.
	(expand_setmem_prologue): Likewise.
	(expand_constant_movmem_prologue): Likewise.
	(expand_constant_setmem_prologue): Likewise.
	(decide_alg): Add new argument align_unknown.  Fix algorithm of
	strategy selection if TARGET_INLINE_ALL_STRINGOPS is set.
	(decide_alignment): Update desired alignment according to chosen move
	mode.
	(ix86_expand_movmem): Change unrolled_loop strategy to use SSE-moves.
	(ix86_expand_setmem): Likewise.
	(ix86_slow_unaligned_access): Implementation of new hook
	slow_unaligned_access.
	(ix86_promote_rtx_for_memset): Implementation of new hook
	promote_rtx_for_memset.
	* config/i386/sse.md (sse2_loadq): Add expand for sse2_loadq.
	(vec_dupv4si): Add expand for vec_dupv4si.
	(vec_dupv2di): Add expand for vec_dupv2di.
	* emit-rtl.c (adjust_address_1): Improve algorithm for determining
	alignment of address+offset.
	(get_mem_align_offset): Add handling of MEM_REFs.
	* expr.c (compute_align_by_offset): New function.
	(move_by_pieces_insn): New function.
	(widest_mode_for_unaligned_mov): New function.
	(widest_mode_for_aligned_mov): New function.
	(widest_int_mode_for_size): Change type of size from int to
	HOST_WIDE_INT.
	(set_by_pieces_1): New function (new algorithm of memset expanding).
	(set_by_pieces_2): New function.
	(generate_move_with_mode): New function for set_by_pieces.
	(alignment_for_piecewise_move): Use hook slow_unaligned_access instead
	of macros SLOW_UNALIGNED_ACCESS.
	(emit_group_load_1): Likewise.
	(emit_group_store): Likewise.
	(emit_push_insn): Likewise.
	(store_field): Likewise.
	(expand_expr_real_1): Likewise.
	(compute_aligned_cost): New function.
	(compute_unaligned_cost): New function.
	(vector_mode_for_mode): New function.
	(vector_extensions_used_for_mode): New function.
	(move_by_pieces): New algorithm of memmove expanding.
	(move_by_pieces_ninsns): Update according to changes in move_by_pieces.
	(move_by_pieces_1): Remove as unused.
	(store_by_pieces): New algorithm for memset expanding.
	(clear_by_pieces): Likewise.
	(store_by_pieces_1): Remove incorrect parameters' attributes.
	* expr.h (compute_align_by_offset): Add declaration.
	* rtl.h (vector_extensions_used_for_mode): Add declaration.
	* builtins.c (expand_builtin_memset_args): Update according to changes
	in set_by_pieces.
	* target.def (DEFHOOK): Add hook slow_unaligned_access and
	promote_rtx_for_memset.
	* targhooks.c (default_slow_unaligned_access): Add default hook
	implementation.
	(default_promote_rtx_for_memset): Likewise.
	* targhooks.h (default_slow_unaligned_access): Add prototype.
	(default_promote_rtx_for_memset): Likewise.
	* cse.c (cse_insn): Stop forward propagation of vector constants.
	* fwprop.c (forward_propagate_and_simplify): Likewise.
	* doc/tm.texi (SLOW_UNALIGNED_ACCESS): Remove documentation for deleted
	macro SLOW_UNALIGNED_ACCESS.
	(TARGET_SLOW_UNALIGNED_ACCESS): Add documentation on
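The message says the move mode is chosen from the alignment known at compile time. As a rough illustration of that idea only, the sketch below picks the widest power-of-two chunk that the known alignment, the remaining size and a maximum register width all allow, and counts how many moves a block would need. The names widest_chunk and count_moves are invented here; the actual selection in expr.c and i386.c is driven by cost models and target hooks that this sketch ignores, including the possibility of cheap unaligned accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: widest power-of-two chunk (in bytes) that does not
   exceed the known alignment, the bytes remaining, or the widest
   move the target is assumed to support.  */
static size_t widest_chunk(size_t align, size_t remaining, size_t max_chunk)
{
    size_t chunk = 1;

    assert(align >= 1 && remaining >= 1 && max_chunk >= 1);
    while (chunk * 2 <= align
           && chunk * 2 <= remaining
           && chunk * 2 <= max_chunk)
        chunk *= 2;
    return chunk;
}

/* Hypothetical: number of moves needed to copy SIZE bytes from a base
   address aligned to ALIGN bytes, always taking the widest chunk.
   Because chunk sizes never grow, each chunk stays naturally aligned
   as long as ALIGN is a power of two.  */
static size_t count_moves(size_t size, size_t align, size_t max_chunk)
{
    size_t moves = 0;

    while (size > 0) {
        size -= widest_chunk(align, size, max_chunk);
        moves++;
    }
    return moves;
}
```

With a 16-byte maximum, a 35-byte block at a 16-byte-aligned address is covered by chunks of 16, 16, 2 and 1 bytes, whereas the same block with only byte alignment known would need 35 single-byte moves under this strict model. That gap is the kind of gain wider moves are meant to capture.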
C++ PATCH for c++/49672 (ICE with variadic parms to lambda)
Note that this doesn't allow capture of a pack expansion yet; it just fixes a hole in the patch for c++/48424. When instantiating a template function that has a non-pack parameter after a parameter pack, we were incorrectly treating it as part of the pack, leading to confusion.

Tested x86_64-pc-linux-gnu, applying to trunk. I suppose I should also apply it to 4.6 since it has the earlier 48424 patch.

commit 5a1ca9b80d38e86fc997289e0eb90f3bbc98ad0d
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jul 11 16:25:25 2011 -0400

    	PR c++/49672
    	* pt.c (extract_fnparm_pack): Split out from...
    	(make_fnparm_pack): ...here.
    	(instantiate_decl): Handle non-pack parms after a pack.
    	* semantics.c (maybe_add_lambda_conv_op): Don't in a template.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7c735ef..33b5b5f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8711,11 +8711,12 @@ tsubst_template_arg (tree t, tree args, tsubst_flags_t complain, tree in_decl)
   return r;
 }
 
-/* Give a chain SPEC_PARM of PARM_DECLs, pack them into a
-   NONTYPE_ARGUMENT_PACK.  */
+/* Given a function parameter pack TMPL_PARM and some function parameters
+   instantiated from it at *SPEC_P, return a NONTYPE_ARGUMENT_PACK of them
+   and set *SPEC_P to point at the next point in the list.  */
 
 static tree
-make_fnparm_pack (tree spec_parm)
+extract_fnparm_pack (tree tmpl_parm, tree *spec_p)
 {
   /* Collect all of the extra "packed" parameters into an
      argument pack.  */
@@ -8723,11 +8724,18 @@ make_fnparm_pack (tree spec_parm)
   tree parmtypevec;
   tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
   tree argtypepack = cxx_make_type (TYPE_ARGUMENT_PACK);
-  int i, len = list_length (spec_parm);
+  tree spec_parm = *spec_p;
+  int i, len;
+
+  for (len = 0; spec_parm; ++len, spec_parm = TREE_CHAIN (spec_parm))
+    if (tmpl_parm
+        && !function_parameter_expanded_from_pack_p (spec_parm, tmpl_parm))
+      break;
 
   /* Fill in PARMVEC and PARMTYPEVEC with all of the parameters.  */
   parmvec = make_tree_vec (len);
   parmtypevec = make_tree_vec (len);
+  spec_parm = *spec_p;
   for (i = 0; i < len; i++, spec_parm = DECL_CHAIN (spec_parm))
     {
       TREE_VEC_ELT (parmvec, i) = spec_parm;
@@ -8738,9 +8746,19 @@ make_fnparm_pack (tree spec_parm)
   SET_ARGUMENT_PACK_ARGS (argpack, parmvec);
   SET_ARGUMENT_PACK_ARGS (argtypepack, parmtypevec);
   TREE_TYPE (argpack) = argtypepack;
+  *spec_p = spec_parm;
 
   return argpack;
-}
+}
+
+/* Give a chain SPEC_PARM of PARM_DECLs, pack them into a
+   NONTYPE_ARGUMENT_PACK.  */
+
+static tree
+make_fnparm_pack (tree spec_parm)
+{
+  return extract_fnparm_pack (NULL_TREE, &spec_parm);
+}
 
 /* Substitute ARGS into T, which is an pack expansion
    (i.e. TYPE_PACK_EXPANSION or EXPR_PACK_EXPANSION). Returns a
@@ -17830,21 +17848,21 @@ instantiate_decl (tree d, int defer_ok,
           spec_parm = skip_artificial_parms_for (d, spec_parm);
           tmpl_parm = skip_artificial_parms_for (subst_decl, tmpl_parm);
         }
-      while (tmpl_parm && !FUNCTION_PARAMETER_PACK_P (tmpl_parm))
+      for (; tmpl_parm; tmpl_parm = DECL_CHAIN (tmpl_parm))
         {
-          register_local_specialization (spec_parm, tmpl_parm);
-          tmpl_parm = DECL_CHAIN (tmpl_parm);
-          spec_parm = DECL_CHAIN (spec_parm);
+          if (!FUNCTION_PARAMETER_PACK_P (tmpl_parm))
+            {
+              register_local_specialization (spec_parm, tmpl_parm);
+              spec_parm = DECL_CHAIN (spec_parm);
+            }
+          else
+            {
+              /* Register the (value) argument pack as a specialization of
+                 TMPL_PARM, then move on.  */
+              tree argpack = extract_fnparm_pack (tmpl_parm, &spec_parm);
+              register_local_specialization (argpack, tmpl_parm);
+            }
         }
-      if (tmpl_parm && FUNCTION_PARAMETER_PACK_P (tmpl_parm))
-        {
-          /* Register the (value) argument pack as a specialization of
-             TMPL_PARM, then move on.  */
-          tree argpack = make_fnparm_pack (spec_parm);
-          register_local_specialization (argpack, tmpl_parm);
-          tmpl_parm = DECL_CHAIN (tmpl_parm);
-          spec_parm = NULL_TREE;
-        }
       gcc_assert (!spec_parm);
 
       /* Substitute into the body of the function.  */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 84b0dd8..fd00e29 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -8808,6 +8808,9 @@ maybe_add_lambda_conv_op (tree type)
   if (LAMBDA_EXPR_CAPTURE_LIST (CLASSTYPE_LAMBDA_EXPR (type)) != NULL_TREE)
     return;
 
+  if (processing_template_decl)
+    return;
+
   stattype = build_function_type (TREE_TYPE (TREE_TYPE (callop)),
                                   FUNCTION_ARG_CHAIN (callop));
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic1.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic1.C
new file mode 100644
index 0000000..f17b336
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic1.C
@@ -0,0 +1,15 @@
+// PR c++/49672
+// { dg-options "-std=c++0x" }
+
+template<typename ... Args>
+static void foo()
+{
+  [](Args..., int x) {
+    x;
+  };
+}
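The core of the fix is that extract_fnparm_pack consumes only the parameters that were expanded from the pack and leaves the caller's cursor on the first parameter after it. A toy model of that list walk, using a plain linked list instead of GCC trees, might look as follows. The struct param, its pack_id field and extract_pack are all invented for this sketch; a pack_id of 0 plays the role of passing NULL_TREE, meaning "take everything that is left".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chain of PARM_DECLs.  */
typedef struct param {
    int pack_id;          /* 0: ordinary parameter; otherwise the id of
                             the pack this parameter was expanded from */
    struct param *next;
} param;

/* Count the leading parameters at *SPEC_P that belong to pack PACK_ID
   and advance *SPEC_P past them.  With PACK_ID == 0 every remaining
   parameter is consumed.  Returns the number consumed.  */
static size_t extract_pack(int pack_id, param **spec_p)
{
    size_t len = 0;
    param *p;

    assert(spec_p != NULL);
    for (p = *spec_p; p != NULL; p = p->next, len++)
        if (pack_id != 0 && p->pack_id != pack_id)
            break;

    *spec_p = p;   /* caller continues with the first non-pack param */
    return len;
}
```

For a list modelling (Args..., int x) with two pack elements, the call consumes two entries and leaves the cursor on x, so the caller can register x as an ordinary parameter instead of folding it into the pack.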
New automaton_option collapse-ndfa
On C6X, we want to use the "ndfa" option to give the scheduler maximum freedom when assigning units to instructions. After scheduling is complete, we process the insns again in c6x_reorg, looking at each cycle and assigning a unit specifier to each instruction so that there is no conflict within a cycle.

This works, except for a few reservations that span more than the first cycle. To handle these properly, one possibility would be to consider the entire scheduled block with a more complicated algorithm. While feasible, I'd prefer not to go there for the moment.

I came up with the notion of adding a new transition to the NDFA, one which collapses a nondeterministic state (which is composed of multiple possible deterministic ones) to just one of its component states. This can be done at the end of each cycle, and gives a state that can be processed with cpu_unit_reservation_p to identify the units chosen by the scheduler.

The following patch implements this. The new option also modifies the generation of advance-cycle transitions so that they only exist in deterministic states. This matches the expected use of the feature where we have a collapse-ndfa transition before the end of each cycle (using the dfa_pre_cycle_insn hook). state_transition now recognizes const0_rtx as the collapse-ndfa transition (NULL_RTX was taken for advance-cycle).

Tested with 4.5 c6x-elf so far. I hope to commit the C6X port to mainline soon and will retest the patch with that as well. IA64 is another user of the ndfa option, but it failed to bootstrap a clean tree when I tried it a few days ago, so I've only built a cross-cc1 and examined the generated insn-automata.c before/after the patch. No changes beyond slight expected reorganization in the code recognizing NULL_RTX as advance-cycle, and no changes in the ia64.dfa file generated with the "v" option.

Ok?

Bernd

	* doc/md.texi (automata_option): Document collapse-ndfa.
	* genautomata.c (COLLAPSE_OPTION): New macro.
	(collapse_flag): New static variable.
	(struct description): New member normal_decls_num.
	(struct automaton): New members advance_ainsn and collapse_ainsn.
	(gen_automata_option): Check for COLLAPSE_OPTION.
	(collapse_ndfa_insn_decl): New static variable.
	(add_collapse_ndfa_insn_decl, special_decl_p): New functions.
	(find_arc): If insn is the collapse-ndfa insn, accept any arc we
	find.
	(transform_insn_regexps): Call add_collapse_ndfa_insn_decl if
	necessary.  Use normal_decls_num rather than decls_num, remove
	test for special decls.
	(create_alt_states, form_ainsn_with_same_reservs): Use special_decl_p.
	(make_automaton): Likewise.  Use the new advance_cycle_insn member
	of struct automaton.
	(create_composed_state): Disallow advance-cycle arcs if collapse_flag
	is set.
	(NDFA_to_DFA): Don't create composed states for the collapse-ndfa
	transition.  Create the necessary transitions for it.
	(create_ainsns): Return void.  Take an automaton_t argument, and
	update its ainsn_list, advance_ainsn and collapse_ainsn members.  All
	callers changed.
	(COLLAPSE_NDFA_VALUE_NAME): New macro.
	(output_tables): Output code to define it.
	(output_internal_insn_code_evaluation): Output code to accept
	const0_rtx as collapse-ndfa transition.
	(output_default_latencies, output_print_reservation_func,
	output_print_description): Reorganize loops to use normal_decls_num
	as loop bound; remove special case for advance_cycle_insn_decl.
	(initiate_automaton_gen): Handle COLLAPSE_OPTION.
	(check_automata_insn_issues): Check for collapse_ainsn.
	(expand_automate): Allocate sufficient space.  Initialize
	normal_decls_num.

Index: doc/md.texi
===================================================================
--- doc/md.texi	(revision 176171)
+++ doc/md.texi	(working copy)
@@ -7859,6 +7859,16 @@ nondeterministic treatment means trying
 may be rejected by reservations in the subsequent insns.
 
 @item
+@dfn{collapse-ndfa} modifies the behaviour of the generator when
+producing an automaton.  An additional state transition to collapse a
+nondeterministic @acronym{NDFA} state to a deterministic @acronym{DFA}
+state is generated.  It can be triggered by passing @code{const0_rtx} to
+state_transition.  In such an automaton, cycle advance transitions are
+available only for these collapsed states.  This option is useful for
+ports that want to use the @code{ndfa} option, but also want to use
+@code{define_query_cpu_unit} to assign units to insns issued in a cycle.
+
+@item
 @dfn{progress} means output of a progress bar showing how many states
 were generated so far for automaton being processed.  This is useful
 during debugging a @acronym{DFA} description.  If you see too many
Index: genautomata.c
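The collapse idea can be pictured with a deliberately simplified model in which a nondeterministic state is just the set of deterministic states it might be in. The sketch below is not how genautomata represents states; the bitmask encoding, the names state_set, collapse and can_advance_cycle, and the choice of the lowest set bit as the surviving component are all assumptions made for illustration.

```c
#include <assert.h>

/* Toy model: bit i set means "deterministic state i is still possible".  */
typedef unsigned int state_set;

/* A state is deterministic when exactly one component remains.  */
static int is_deterministic(state_set s)
{
    return s != 0 && (s & (s - 1u)) == 0;
}

/* Collapse: keep a single component state.  Which one is kept is an
   arbitrary choice in this sketch (the lowest set bit).  */
static state_set collapse(state_set s)
{
    assert(s != 0);
    return s & (~s + 1u);
}

/* With the collapse option, cycle advance is only offered from
   deterministic (collapsed) states.  */
static int can_advance_cycle(state_set s)
{
    return is_deterministic(s);
}
```

In this picture, a state set 0x6 (components 1 and 2 both possible) cannot advance the cycle; after collapsing it to 0x2, the unit assignment is fixed and the cycle may advance, which corresponds to issuing the collapse transition at the end of each cycle.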
Re: RFC: attribute to reverse bitfield allocations
On Jul 11, 2011, at 1:52 PM, DJ Delorie wrote:
> Finally getting around to writing this one. The idea is to have an attribute which determines how bitfields are allocated

:-) Apple has one of these sorts of creatures. You can see the code in the Apple tree, marked by APPLE LOCAL {begin ,end ,}bitfield reversal. Your code looks much nicer than the Apple code; I hope it works as well.
[pph] Do not call pushdecl_into_namespace to re-register symbols (issue4685053)
This patch changes the way we re-register symbols as they are read from the PPH image. Instead of calling pushdecl...(), we merge the global bindings from the PPH file (scope_chain->bindings) into the global bindings of the current translation unit.

This fixes 3 name lookup failures in the testsuite: c2eabi1.cc, c2meteor-contest.cc and x1namespace.cc. It does produce a new failure in x1tmplclass.cc, which is addressed in the 3rd patch in this series.

Tested on x86_64. Applied to branch.

Diego.

	* pph-streamer-in.c (pph_register_decls_in_symtab): Rename
	from pph_add_bindings_to_namespace.
	Do not call pushdecl_into_namespace for every symbol.  Just
	reset the scope for its identifier's namespace binding.
	(pph_in_scope_chain): Merge every field in struct
	cp_binding_level from the scope_chain->bindings coming from
	STREAM and the current scope_chain->bindings.

testsuite/ChangeLog.pph
	* g++.dg/pph/c2eabi1.cc: Remove XFAIL markers.  Expect an
	assembly difference.
	* g++.dg/pph/c2meteor-contest.cc: Likewise.
	* g++.dg/pph/x1namespace.cc: Mark fixed.

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index 37b8464..1011902 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,13 @@
+2011-07-07  Diego Novillo  dnovi...@google.com
+
+	* pph-streamer-in.c (pph_register_decls_in_symtab): Rename
+	from pph_add_bindings_to_namespace.
+	Do not call pushdecl_into_namespace for every symbol.  Just
+	reset the scope for its identifier's namespace binding.
+	(pph_in_scope_chain): Merge every field in struct
+	cp_binding_level from the scope_chain->bindings coming from
+	STREAM and the current scope_chain->bindings.
+
 2011-07-06  Diego Novillo  dnovi...@google.com
 
 	* pph-streamer-out.c (pph_out_scope_chain): Fix formatting.
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 0bab93b..571ebf5 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1139,38 +1139,32 @@ pph_in_lang_type (pph_stream *stream)
 }
 
 
-/* Add all bindings declared in BL to NS.  */
+/* Register all the symbols in binding level BL in the callgraph symbol
+   table.  NS is the namespace where all the symbols in BL live.  */
 
 static void
-pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns)
+pph_register_decls_in_symtab (struct cp_binding_level *bl, tree ns)
 {
-  tree t, chain;
+  tree t;
 
   /* The chains are built backwards (ref: add_decl_to_level),
      reverse them before putting them back in.  */
   bl->names = nreverse (bl->names);
   bl->namespaces = nreverse (bl->namespaces);
 
-  for (t = bl->names; t; t = chain)
-    {
-      /* Pushing a decl into a scope clobbers its DECL_CHAIN.
-         Preserve it.  */
-      chain = DECL_CHAIN (t);
-      pushdecl_into_namespace (t, ns);
+  for (t = bl->names; t; t = DECL_CHAIN (t))
+    if (DECL_NAME (t) && IDENTIFIER_NAMESPACE_BINDINGS (DECL_NAME (t)))
+      {
+        cxx_binding *b = IDENTIFIER_NAMESPACE_BINDINGS (DECL_NAME (t));
+        b->scope = NAMESPACE_LEVEL (ns);
 
-      if (TREE_CODE (t) == VAR_DECL && TREE_STATIC (t) && !DECL_EXTERNAL (t))
-        varpool_finalize_decl (t);
-    }
+        if (TREE_CODE (t) == VAR_DECL && TREE_STATIC (t) && !DECL_EXTERNAL (t))
+          varpool_finalize_decl (t);
+      }
 
-  for (t = bl->namespaces; t; t = chain)
-    {
-      /* Pushing a decl into a scope clobbers its DECL_CHAIN.
-         Preserve it.  */
-      chain = DECL_CHAIN (t);
-      pushdecl_into_namespace (t, ns);
-      if (NAMESPACE_LEVEL (t))
-        pph_add_bindings_to_namespace (NAMESPACE_LEVEL (t), t);
-    }
+  for (t = bl->namespaces; t; t = DECL_CHAIN (t))
+    if (NAMESPACE_LEVEL (t))
+      pph_register_decls_in_symtab (NAMESPACE_LEVEL (t), t);
 }
 
 
@@ -1179,12 +1173,56 @@ pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns)
 static void
 pph_in_scope_chain (pph_stream *stream)
 {
-  struct cp_binding_level *pph_bindings;
+  struct saved_scope *file_scope_chain;
+  unsigned i;
+  tree decl;
+  cp_class_binding *cb;
+  cp_label_binding *lb;
+  struct cp_binding_level *cur_bindings, *new_bindings;
+
+  file_scope_chain = ggc_alloc_cleared_saved_scope ();
+  file_scope_chain->bindings = new_bindings = pph_in_binding_level (stream);
+  cur_bindings = scope_chain->bindings;
+
+  pph_register_decls_in_symtab (new_bindings, global_namespace);
+
+  /* Merge the bindings from STREAM into saved_scope->bindings.  */
+  chainon (cur_bindings->names, new_bindings->names);
+  chainon (cur_bindings->namespaces, new_bindings->namespaces);
+
+  for (i = 0; VEC_iterate (tree, new_bindings->static_decls, i, decl); i++)
+    VEC_safe_push (tree, gc, cur_bindings->static_decls, decl);
+
+  chainon (cur_bindings->usings, new_bindings->usings);
+  chainon (cur_bindings->using_directives, new_bindings->using_directives);
+
+  for (i = 0;
+       VEC_iterate (cp_class_binding,
[pph] Add alternate addresses to register in the cache (issue4685054)
This patch adapts an idea from Gab that allows us to register alternate addresses in the cache.

The problem here is making sure that symbols read from a PPH file reference the right bindings. If a symbol is in the global namespace when compiling a header file, its bindings will point to NAMESPACE_LEVEL(global_namespace)->bindings, but that global_namespace is the global_namespace instantiated for the header file. When reading that PPH image from a translation unit, we need to refer to the bindings of the *current* global_namespace.

In general we solve this by inserting the pointer in the streamer cache. For instance, to avoid instantiating a second global_namespace decl, the initialization code of both the writer and the reader store global_namespace into the streaming cache. This way, all the references to global_namespace point to the current global_namespace as known by the writer and the reader.

However, we cannot use the same trick on the bindings for global_namespace. If we simply inserted it into the cache, then writing out NAMESPACE_LEVEL(global_namespace)->bindings would simply write a reference to the current one, and on the reader side it would simply restore a pointer to the current translation unit's bindings, without ever actually writing or reading anything (since it was satisfied from the cache).

Therefore, we want a mechanism that allows the reader to: (a) read all the symbols in the global bindings, and (b) make references to the global binding made by the symbols point to the global bindings of the current translation unit (instead of the one in the PPH image).

That's where ALLOC_AND_REGISTER_ALTERNATE comes in. When called, it allocates the data structure but registers another pointer in the cache. We use this trick when calling pph_in_binding_level from the toplevel:

+  new_bindings = pph_in_binding_level (stream, scope_chain->bindings);

When pph_in_binding_level tries to allocate the binding structure read from STREAM, it registers scope_chain->bindings in the cache. This way, references to the original file's global binding are automatically redirected to the current translation unit's global bindings.

Gab, I modified your original implementation to move all the logic to the place where we need to make this decision. This way, it is easier to tell which functions need this alternate registration, instead of relying on some status flag squirreled away in the STREAM data structure.

Tested on x86_64. Applied to branch.

2011-07-11  Diego Novillo  dnovi...@google.com
	    Gabriel Charette  gch...@google.com

	* pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define.
	(pph_in_binding_level): Add argument TO_REGISTER.  Call
	ALLOC_AND_REGISTER_ALTERNATE if set.
	Update all users.
	(pph_register_decls_in_symtab): Call varpool_finalize_decl
	on all file-local symbols.
	(pph_in_scope_chain): Call pph_in_binding_level with
	scope_chain->bindings as the alternate pointer to
	register in the streaming cache.

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index 1011902..f18c2f4 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,16 @@
+2011-07-11  Diego Novillo  dnovi...@google.com
+	    Gabriel Charette  gch...@google.com
+
+	* pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define.
+	(pph_in_binding_level): Add argument TO_REGISTER.  Call
+	ALLOC_AND_REGISTER_ALTERNATE if set.
+	Update all users.
+	(pph_register_decls_in_symtab): Call varpool_finalize_decl
+	on all file-local symbols.
+	(pph_in_scope_chain): Call pph_in_binding_level with
+	scope_chain->bindings as the alternate pointer to
+	register in the streaming cache.
+
 2011-07-07  Diego Novillo  dnovi...@google.com
 
 	* pph-streamer-in.c (pph_register_decls_in_symtab): Rename
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 571ebf5..903cd94 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -42,6 +42,18 @@ along with GCC; see the file COPYING3.  If not see
       pph_register_shared_data (STREAM, DATA, IX); \
     } while (0)
 
+/* Same as ALLOC_AND_REGISTER, but instead of registering DATA into the
+   cache at slot IX, it registers ALT_DATA.  Used to support mapping
+   pointers to global data in the original STREAM that need to point
+   to a different instance when aggregating individual PPH files into
+   the current translation unit (see pph_in_binding_level for an
+   example).  */
+#define ALLOC_AND_REGISTER_ALTERNATE(STREAM, IX, DATA, ALLOC_EXPR, ALT_DATA)\
+    do { \
+      (DATA) = (ALLOC_EXPR); \
+      pph_register_shared_data (STREAM, ALT_DATA, IX); \
+    } while (0)
+
 /* Callback for unpacking value fields in ASTs.  BP is the bitpack we are
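The alternate-registration trick described in the message can be mimicked with a toy pointer cache. Everything in the sketch below (toy_cache, toy_bindings, toy_register, toy_read_bindings, the fixed slot count and the pretend symbol count) is invented for illustration and has nothing to do with the real pph_stream structures. It only shows the shape of the idea: the freshly allocated object is what gets filled in, while the cache slot remembers a different, already existing object, so later lookups of that slot are redirected.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_CACHE_SLOTS 8

/* Hypothetical streamer cache: slot index -> pointer.  */
typedef struct {
    void *slot[TOY_CACHE_SLOTS];
} toy_cache;

/* Hypothetical binding level holding just a symbol count.  */
typedef struct {
    int n_names;
} toy_bindings;

static void toy_register(toy_cache *c, unsigned ix, void *p)
{
    assert(ix < TOY_CACHE_SLOTS);
    c->slot[ix] = p;
}

/* Read a binding level into a newly allocated object.  If ALT is not
   NULL, register ALT in the cache instead of the new object, so that
   any later reference to slot IX resolves to ALT.  The caller owns
   the returned object.  */
static toy_bindings *toy_read_bindings(toy_cache *c, unsigned ix,
                                       toy_bindings *alt)
{
    toy_bindings *fresh = calloc(1, sizeof *fresh);

    if (fresh == NULL)
        return NULL;
    toy_register(c, ix, alt != NULL ? (void *)alt : (void *)fresh);
    fresh->n_names = 3;   /* stand-in for symbols read from the file */
    return fresh;
}
```

Keeping the choice of what to register at the call site, through the alt argument, mirrors the design note in the message about not hiding this decision in a status flag on the stream.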
[pph] Use FOR_EACH_VEC_ELT consistently (issue4673057)
No functional changes. Just tidying calls to VEC_iterate. Tested on x86_64. Committed to branch. Diego. * pph-streamer-in.c (pph_in_scope_chain): Replace VEC_iterate loops with FOR_EACH_VEC_ELT. (pph_read_file_contents): Likewise. * pph-streamer-out.c (pph_out_tree_vec): Likewise. (pph_out_qual_use_vec): Likewise. (pph_out_binding_level): Likewise. (pph_out_tree_pair_vec): Likewise. * pph-streamer.h (pph_out_tree_VEC): Likewise. diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph index f18c2f4..46794d0 100644 --- a/gcc/cp/ChangeLog.pph +++ b/gcc/cp/ChangeLog.pph @@ -1,4 +1,15 @@ 2011-07-11 Diego Novillo dnovi...@google.com + + * pph-streamer-in.c (pph_in_scope_chain): Replace VEC_iterate + loops with FOR_EACH_VEC_ELT. + (pph_read_file_contents): Likewise. + * pph-streamer-out.c (pph_out_tree_vec): Likewise. + (pph_out_qual_use_vec): Likewise. + (pph_out_binding_level): Likewise. + (pph_out_tree_pair_vec): Likewise. + * pph-streamer.h (pph_out_tree_VEC): Likewise. + +2011-07-11 Diego Novillo dnovi...@google.com Gabriel Charette gch...@google.com * pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define. 
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 903cd94..b40c384 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1227,22 +1227,18 @@ pph_in_scope_chain (pph_stream *stream)
   chainon (cur_bindings->names, new_bindings->names);
   chainon (cur_bindings->namespaces, new_bindings->namespaces);

-  for (i = 0; VEC_iterate (tree, new_bindings->static_decls, i, decl); i++)
+  FOR_EACH_VEC_ELT (tree, new_bindings->static_decls, i, decl)
     VEC_safe_push (tree, gc, cur_bindings->static_decls, decl);

   chainon (cur_bindings->usings, new_bindings->usings);
   chainon (cur_bindings->using_directives, new_bindings->using_directives);

-  for (i = 0;
-       VEC_iterate (cp_class_binding, new_bindings->class_shadowed, i, cb);
-       i++)
+  FOR_EACH_VEC_ELT (cp_class_binding, new_bindings->class_shadowed, i, cb)
     VEC_safe_push (cp_class_binding, gc, cur_bindings->class_shadowed, cb);

   chainon (cur_bindings->type_shadowed, new_bindings->type_shadowed);

-  for (i = 0;
-       VEC_iterate (cp_label_binding, new_bindings->shadowed_labels, i, lb);
-       i++)
+  FOR_EACH_VEC_ELT (cp_label_binding, new_bindings->shadowed_labels, i, lb)
     VEC_safe_push (cp_label_binding, gc, cur_bindings->shadowed_labels, lb);

   chainon (cur_bindings->blocks, new_bindings->blocks);
@@ -1412,14 +1408,14 @@ pph_read_file_contents (pph_stream *stream)
   keyed_classes = chainon (file_keyed_classes, keyed_classes);

   file_unemitted_tinfo_decls = pph_in_tree_vec (stream);
-  for (i = 0; VEC_iterate (tree, file_unemitted_tinfo_decls, i, t); i++)
+  FOR_EACH_VEC_ELT (tree, file_unemitted_tinfo_decls, i, t)
     VEC_safe_push (tree, gc, unemitted_tinfo_decls, t);

   file_static_aggregates = pph_in_tree (stream);
   static_aggregates = chainon (file_static_aggregates, static_aggregates);

   /* Expand all the functions with bodies that we read from STREAM.  */
-  for (i = 0; VEC_iterate (tree, stream->fns_to_expand, i, fndecl); i++)
+  FOR_EACH_VEC_ELT (tree, stream->fns_to_expand, i, fndecl)
     {
       /* FIXME pph - This is somewhat gross.  When we generated the
	 PPH image, the parser called expand_or_defer_fn on FNDECL,
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index 089bb13..f7bf739 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -483,7 +483,7 @@ pph_out_tree_vec (pph_stream *stream, VEC(tree,gc) *v, bool ref_p)
   tree t;

   pph_out_uint (stream, VEC_length (tree, v));
-  for (i = 0; VEC_iterate (tree, v, i, t); i++)
+  FOR_EACH_VEC_ELT (tree, v, i, t)
     pph_out_tree_or_ref (stream, t, ref_p);
 }

@@ -499,7 +499,7 @@ pph_out_qual_use_vec (pph_stream *stream,
   qualified_typedef_usage_t *q;

   pph_out_uint (stream, VEC_length (qualified_typedef_usage_t, v));
-  for (i = 0; VEC_iterate (qualified_typedef_usage_t, v, i, q); i++)
+  FOR_EACH_VEC_ELT (qualified_typedef_usage_t, v, i, q)
     {
       pph_out_tree_or_ref (stream, q->typedef_decl, ref_p);
       pph_out_tree_or_ref (stream, q->context, ref_p);
@@ -657,13 +657,13 @@ pph_out_binding_level (pph_stream *stream, struct cp_binding_level *bl,
   pph_out_chain_filtered (stream, bl->using_directives, ref_p, NO_BUILTINS);

   pph_out_uint (stream, VEC_length (cp_class_binding, bl->class_shadowed));
-  for (i = 0; VEC_iterate (cp_class_binding, bl->class_shadowed, i, cs); i++)
+  FOR_EACH_VEC_ELT (cp_class_binding, bl->class_shadowed, i, cs)
     pph_out_class_binding (stream, cs, ref_p);

   pph_out_tree_or_ref (stream, bl->type_shadowed, ref_p);

   pph_out_uint (stream, VEC_length (cp_label_binding, bl->shadowed_labels));
-  for (i = 0; VEC_iterate (cp_label_binding, bl->shadowed_labels, i, sl); i++)
+  FOR_EACH_VEC_ELT
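For readers unfamiliar with the two idioms in this patch, here is a minimal sketch of how an iterate-style loop header collapses into a FOR_EACH-style macro. The names `int_vec`, `int_vec_iterate` and `FOR_EACH_INT_VEC_ELT` are simplified stand-ins invented for illustration; they are not GCC's vec.h definitions, which are type-generic.

```c
#include <assert.h>
#include <stddef.h>

/* Toy vector of ints; a stand-in for GCC's VEC(T,A), not the real type.  */
struct int_vec
{
  size_t length;
  int *data;
};

/* Store element IX of V in *PTR and return 1, or return 0 once IX is
   past the end (or V is NULL).  */
static int
int_vec_iterate (const struct int_vec *v, size_t ix, int *ptr)
{
  if (v != NULL && ix < v->length)
    {
      *ptr = v->data[ix];
      return 1;
    }
  return 0;
}

/* Hypothetical macro that wraps the open-coded loop header.  */
#define FOR_EACH_INT_VEC_ELT(V, I, P) \
  for ((I) = 0; int_vec_iterate ((V), (I), &(P)); ++(I))

/* Open-coded form, as in the lines the patch removes.  */
static int
sum_open_coded (const struct int_vec *v)
{
  size_t i;
  int elt = 0, sum = 0;
  for (i = 0; int_vec_iterate (v, i, &elt); i++)
    sum += elt;
  return sum;
}

/* Macro form, as in the lines the patch adds.  */
static int
sum_with_macro (const struct int_vec *v)
{
  size_t i;
  int elt = 0, sum = 0;
  FOR_EACH_INT_VEC_ELT (v, i, elt)
    sum += elt;
  return sum;
}
```

Both functions walk the vector identically, which is why the patch can be described as having no functional changes.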
New German PO file for 'gcc' (version 4.6.1)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the German team of translators. The file is available at: http://translationproject.org/latest/gcc/de.po (This file, 'gcc-4.6.1.de.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: http://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: http://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator. coordina...@translationproject.org
Re: PING: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD > sizeof (void *)
Ping.

On Wed, Jul 6, 2011 at 2:20 PM, H.J. Lu hjl.to...@gmail.com wrote:
PING.

On Thu, Jun 30, 2011 at 1:47 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Thu, Jun 30, 2011 at 12:02 PM, Richard Henderson r...@redhat.com wrote:
On 06/30/2011 11:23 AM, H.J. Lu wrote:

+#ifdef REG_VALUE_IN_UNWIND_CONTEXT
+typedef _Unwind_Word _Unwind_Context_Reg_Val;
+/* Signal frame context.  */
+#define SIGNAL_FRAME_BIT ((_Unwind_Word) 1 0)

There's absolutely no reason to re-define this. So what if the value is most-significant-bit set? Nor do I see any reason not to continue setting E_C_B.

Done.

+#define _Unwind_IsExtendedContext(c) 1

Why is this not still an inline function?

It is defined before _Unwind_Context is declared. I used macros so that there can be one less #ifdef.

+
+static inline _Unwind_Word
+_Unwind_Get_Unwind_Word (_Unwind_Context_Reg_Val val)
+{
+  return val;
+}
+
+static inline _Unwind_Context_Reg_Val
+_Unwind_Get_Unwind_Context_Reg_Val (_Unwind_Word val)
+{
+  return val;
+}

I cannot believe this actually works. I see nowhere that you copy the by-address slot out of the stack frame and place it into the by-value slot in the unwind context.

I changed the implementation based on the feedback from Jason. Now I use the same reg field for both value and address.

   /* This will segfault if the register hasn't been saved.  */
   if (size == sizeof(_Unwind_Ptr))
-    return * (_Unwind_Ptr *) ptr;
+    return * (_Unwind_Ptr *) (_Unwind_Internal_Ptr) val;
   else
     {
       gcc_assert (size == sizeof(_Unwind_Word));
-      return * (_Unwind_Word *) ptr;
+      return * (_Unwind_Word *) (_Unwind_Internal_Ptr) val;
     }

Indeed, this section is both wrong and belies the change you purport to make. You didn't even test this, did you?

Here is the updated patch. It works on simple tests. I am running full tests. I kept config/i386/value-unwind.h since libgcc/md-unwind-support.h is included too late in unwind-dw2.c and I don't want to move it to be on the safe side.

OK for trunk?

Thanks.

--
H.J.
---
gcc/

2011-06-30 H.J. Lu hongjiu...@intel.com

	* config.gcc (libgcc_tm_file): Add i386/value-unwind.h for
	Linux/x86.
	* system.h (REG_VALUE_IN_UNWIND_CONTEXT): Poisoned.
	* unwind-dw2.c (_Unwind_Context_Reg_Val): New.
	(_Unwind_Get_Unwind_Word): Likewise.
	(_Unwind_Get_Unwind_Context_Reg_Val): Likewise.
	(_Unwind_Context): Use _Unwind_Context_Reg_Val on the reg field.
	(_Unwind_IsExtendedContext): Defined as macro.
	(_Unwind_GetGR): Updated.
	(_Unwind_SetGR): Likewise.
	(_Unwind_GetGRPtr): Likewise.
	(_Unwind_SetGRPtr): Likewise.
	(_Unwind_SetGRValue): Likewise.
	(_Unwind_GRByValue): Likewise.
	(__frame_state_for): Likewise.
	(uw_install_context_1): Likewise.
	* doc/tm.texi.in: Document REG_VALUE_IN_UNWIND_CONTEXT.
	* doc/tm.texi: Regenerated.

libgcc/

2011-06-30 H.J. Lu hongjiu...@intel.com

	* config/i386/value-unwind.h: New.

--
H.J.

--
H.J.
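The idea of using one register field for both a value and an address can be illustrated with a small sketch. Everything here (`toy_context`, `toy_set_value`, `toy_set_address`, `toy_get`, the 64-bit slot width) is hypothetical and only models the concept discussed in the thread; it is not the code in unwind-dw2.c.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NREGS 4

/* Hypothetical, simplified context: each slot is a full-width word that
   holds either the register's value or the address where the register
   was saved.  A per-register flag records which one it is.  */
struct toy_context
{
  uint64_t reg[TOY_NREGS];
  unsigned char by_value[TOY_NREGS];
};

/* Record the register's value directly in the slot.  */
static void
toy_set_value (struct toy_context *c, int i, uint64_t val)
{
  c->reg[i] = val;
  c->by_value[i] = 1;
}

/* Record the address of the saved register in the same slot.  */
static void
toy_set_address (struct toy_context *c, int i, const uint64_t *addr)
{
  c->reg[i] = (uint64_t) (uintptr_t) addr;
  c->by_value[i] = 0;
}

/* Read the register, dereferencing the slot only when it holds an
   address.  */
static uint64_t
toy_get (const struct toy_context *c, int i)
{
  if (c->by_value[i])
    return c->reg[i];
  return *(const uint64_t *) (uintptr_t) c->reg[i];
}
```

The point of a word-sized slot in this sketch is that it stays wide enough for a register value even when a pointer is narrower than a word, which is the situation the subject line describes.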
[PATCH] [Annotalysis] Fix to get_canonical_lock_expr
This patch fixes get_canonical_lock_expr so that it works on lock expressions that involve a MEM_REF. Gimple code can use either MEM_REF or INDIRECT_REF in many expressions, and the choice of which to use is somewhat arbitrary. The canonical form of a lock expression must rewrite all MEM_REFs to INDIRECT_REFs to accurately compare expressions. The surrounding if block prevented this rewrite from happening in certain cases.

Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu.

Okay for branches/annotalysis and google/main?

-DeLesley

2011-07-06 DeLesley Hutchins deles...@google.com

	* cp_get_virtual_function_decl.c (handle_call_gs): Changes
	function to return null if the method cannot be found.
	* thread_annot_lock-79.C: Additional annotalysis test cases

Index: gcc/tree-threadsafe-analyze.c
===
--- gcc/tree-threadsafe-analyze.c (revision 176188)
+++ gcc/tree-threadsafe-analyze.c (working copy)
@@ -959,19 +959,17 @@ get_canonical_lock_expr (tree lock, tree base_obj,
         tree canon_base = get_canonical_lock_expr (base, base_obj,
                                                    true /* is_temp_expr */,
                                                    new_leftmost_base_var);
-        if (base != canon_base)
-          {
-            /* If CANON_BASE is an ADDR_EXPR (e.g. &a), doing an indirect or
-               memory reference on top of it is equivalent to accessing the
-               variable itself. That is, *(&a) == a. So if that's the case,
-               simply return the variable. Otherwise, build an indirect ref
-               expression.  */
-            if (TREE_CODE (canon_base) == ADDR_EXPR)
-              lock = TREE_OPERAND (canon_base, 0);
-            else
-              lock = build1 (INDIRECT_REF,
-                             TREE_TYPE (TREE_TYPE (canon_base)), canon_base);
-          }
+
+        /* If CANON_BASE is an ADDR_EXPR (e.g. &a), doing an indirect or
+           memory reference on top of it is equivalent to accessing the
+           variable itself. That is, *(&a) == a. So if that's the case,
+           simply return the variable. Otherwise, build an indirect ref
+           expression.  */
+        if (TREE_CODE (canon_base) == ADDR_EXPR)
+          lock = TREE_OPERAND (canon_base, 0);
+        else
+          lock = build1 (INDIRECT_REF,
+                         TREE_TYPE (TREE_TYPE (canon_base)), canon_base);
         break;
       }
     default:

--
DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315
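To make the canonicalization rule concrete, here is a minimal sketch on a toy expression tree. The `toy_expr` type, the `TOY_*` codes and `toy_canon` are invented for illustration and only mimic the shape of the logic; they are not GCC's tree API, and unlike the real function this sketch rewrites nodes in place.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression codes standing in for VAR_DECL, ADDR_EXPR,
   INDIRECT_REF and MEM_REF.  */
enum toy_code { TOY_VAR, TOY_ADDR, TOY_INDIRECT, TOY_MEM };

struct toy_expr
{
  enum toy_code code;
  struct toy_expr *operand;	/* NULL for TOY_VAR.  */
  const char *name;		/* Only meaningful for TOY_VAR.  */
};

/* Return the canonical form of E.  A dereference of an address-of
   collapses to the variable itself; every other dereference becomes
   TOY_INDIRECT, whether or not its base changed.  */
static struct toy_expr *
toy_canon (struct toy_expr *e)
{
  switch (e->code)
    {
    case TOY_MEM:
    case TOY_INDIRECT:
      {
	struct toy_expr *base = toy_canon (e->operand);
	if (base->code == TOY_ADDR)
	  return base->operand;	/* *(&a) == a  */
	e->code = TOY_INDIRECT;	/* Rewrite unconditionally.  */
	e->operand = base;
	return e;
      }
    case TOY_ADDR:
      e->operand = toy_canon (e->operand);
      return e;
    default:
      return e;
    }
}
```

In this sketch a TOY_MEM over a plain variable still becomes TOY_INDIRECT even though its base is unchanged, which is the case the removed `if (base != canon_base)` guard used to skip.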
Re: Add __builtin_clrsb, similar to clz/ctz
On Mon, 20 Jun 2011, Bernd Schmidt wrote: New patch below. Retested on i686 and bfin. Yay, bikeshedding opportunity! :P Can we call them leading *repeated* sign bits? (in docs and comments) Calling them redundant makes you think the representation is not two's complement but new and improved... like (bitwise) (131) == -1 or something. brgds, H-P PS. just a minor change, I can do the legwork.
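As an illustration of what "leading repeated sign bits" means, here is a portable sketch that counts them for a 32-bit value. The function name `toy_clrsb32` is made up for this example; it is a plain loop meant to show the definition, not the builtin's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Count how many bits directly below the sign bit are copies of the
   sign bit.  The sign bit itself is not counted, so the result is in
   the range 0..31 for a 32-bit input.  */
static int
toy_clrsb32 (int32_t x)
{
  uint32_t u = (uint32_t) x;
  uint32_t sign = u >> 31;
  int count = 0;
  int bit;

  for (bit = 30; bit >= 0; bit--)
    {
      if (((u >> bit) & 1u) != sign)
	break;
      count++;
    }
  return count;
}
```

With this definition, 0 and -1 both give 31, 1 gives 30, and INT32_MIN gives 0, which is why "repeated" describes the bits at least as well as "redundant".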
Re: Use of vector instructions in memmov/memset expanding
On Mon, Jul 11, 2011 at 1:57 PM, Michael Zolotukhin michael.v.zolotuk...@gmail.com wrote: Sorry, for sending once again - forgot to attach the patch. On 11 July 2011 23:50, Michael Zolotukhin michael.v.zolotuk...@gmail.com wrote: The attached patch enables use of vector instructions in memmov/memset expanding. New algorithm for move-mode selection is implemented for move_by_pieces, store_by_pieces. x86-specific ix86_expand_movmem and ix86_expand_setmem are also changed in similar way, x86 cost-models parameters are slightly changed to support this. This implementation checks if array's alignment is known at compile time and chooses expanding algorithm and move-mode according to it. Bootstrapped, two new fails due to incorrect tests (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49503). New implementation gives quite big performance gain on memset/memcpy in some cases. A bunch of new tests are added to verify the implementation. Is it ok for trunk? Changelog: 2011-07-11 Zolotukhin Michael michael.v.zolotuk...@intel.com * config/i386/i386.h (processor_costs): Add second dimension to stringop_algs array. (clear_ratio): Tune value to improve performance. * config/i386/i386.c (cost models): Initialize second dimension of stringop_algs arrays. Tune cost model in atom_cost, generic32_cost and generic64_cost. (ix86_expand_move): Add support for vector moves, that use half of vector register. (expand_set_or_movmem_via_loop_with_iter): New function. (expand_set_or_movmem_via_loop): Enable reuse of the same iters in different loops, produced by this function. (emit_strset): New function. (promote_duplicated_reg): Add support for vector modes, add declaration. (promote_duplicated_reg_to_size): Likewise. (expand_movmem_epilogue): Add epilogue generation for bigger sizes. (expand_setmem_epilogue): Likewise. (expand_movmem_prologue): Likewise for prologue. (expand_setmem_prologue): Likewise. (expand_constant_movmem_prologue): Likewise. 
(expand_constant_setmem_prologue): Likewise. (decide_alg): Add new argument align_unknown. Fix algorithm of strategy selection if TARGET_INLINE_ALL_STRINGOPS is set. (decide_alignment): Update desired alignment according to chosen move mode. (ix86_expand_movmem): Change unrolled_loop strategy to use SSE-moves. (ix86_expand_setmem): Likewise. (ix86_slow_unaligned_access): Implementation of new hook slow_unaligned_access. (ix86_promote_rtx_for_memset): Implementation of new hook promote_rtx_for_memset. * config/i386/sse.md (sse2_loadq): Add expand for sse2_loadq. (vec_dupv4si): Add expand for vec_dupv4si. (vec_dupv2di): Add expand for vec_dupv2di. * emit-rtl.c (adjust_address_1): Improve algorithm for determining alignment of address+offset. (get_mem_align_offset): Add handling of MEM_REFs. * expr.c (compute_align_by_offset): New function. (move_by_pieces_insn): New function. (widest_mode_for_unaligned_mov): New function. (widest_mode_for_aligned_mov): New function. (widest_int_mode_for_size): Change type of size from int to HOST_WIDE_INT. (set_by_pieces_1): New function (new algorithm of memset expanding). (set_by_pieces_2): New function. (generate_move_with_mode): New function for set_by_pieces. (alignment_for_piecewise_move): Use hook slow_unaligned_access instead of macros SLOW_UNALIGNED_ACCESS. (emit_group_load_1): Likewise. (emit_group_store): Likewise. (emit_push_insn): Likewise. (store_field): Likewise. (expand_expr_real_1): Likewise. (compute_aligned_cost): New function. (compute_unaligned_cost): New function. (vector_mode_for_mode): New function. (vector_extensions_used_for_mode): New function. (move_by_pieces): New algorithm of memmove expanding. (move_by_pieces_ninsns): Update according to changes in move_by_pieces. (move_by_pieces_1): Remove as unused. (store_by_pieces): New algorithm for memset expanding. (clear_by_pieces): Likewise. (store_by_pieces_1): Remove incorrect parameters' attributes. * expr.h (compute_align_by_offset): Add declaration. 
* rtl.h (vector_extensions_used_for_mode): Add declaration. * builtins.c (expand_builtin_memset_args): Update according to changes in set_by_pieces. * target.def (DEFHOOK): Add hook slow_unaligned_access and promote_rtx_for_memset. * targhooks.c (default_slow_unaligned_access): Add default hook implementation. (default_promote_rtx_for_memset): Likewise. * targhooks.h (default_slow_unaligned_access): Add prototype. (default_promote_rtx_for_memset): Likewise. * cse.c (cse_insn): Stop forward propagation of vector constants. * fwprop.c (forward_propagate_and_simplify): Likewise. * doc/tm.texi (SLOW_UNALIGNED_ACCESS): Remove
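The description above says the expander picks a move mode from the alignment known at compile time. A very rough sketch of that kind of decision follows, under loud assumptions: `pick_chunk` and `count_moves` are hypothetical helpers, alignment and the chunk limit are powers of two (at least 1), and only aligned moves are modelled. The actual patch also weighs unaligned moves and target cost tables, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Return the widest power-of-two chunk, in bytes, that is no larger
   than MAX_CHUNK, fits in REMAINING, and does not exceed the known
   alignment ALIGN.  MAX_CHUNK and ALIGN are assumed to be powers of
   two and at least 1.  */
static size_t
pick_chunk (size_t align, size_t remaining, size_t max_chunk)
{
  size_t chunk = max_chunk;
  while (chunk > 1 && (chunk > remaining || chunk > align))
    chunk /= 2;
  return chunk;
}

/* Count how many moves a piecewise copy of SIZE bytes would need when
   each step uses the widest chunk allowed by pick_chunk.  */
static size_t
count_moves (size_t align, size_t size, size_t max_chunk)
{
  size_t moves = 0;
  while (size > 0)
    {
      size -= pick_chunk (align, size, max_chunk);
      moves++;
    }
  return moves;
}
```

For example, copying 35 bytes with 16-byte alignment and a 16-byte limit takes four moves in this model (16, 16, 2, 1), while the same copy with only 1-byte alignment takes 35, which gives a feel for why known alignment matters to the expansion strategy.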
re: Fix argument pushes to unaligned stack slots
hi folks.

i'm having a problem with GCC 4.5.3 on netbsd-m68k target. i've tracked it down to this change from several years ago:

2007-02-06 Joseph Myers jos...@codesourcery.com

	* expr.c (emit_push_insn): If STRICT_ALIGNMENT, copy to an
	unaligned stack slot via a suitably aligned slot.

the problem is that emit_library_call_value_1() calls emit_push_insn() with TYPE_NULL which ends up triggering a NULL deref when emit_push_insn() calls assign_temp() with type = TYPE_NULL, and assign_temp() crashes.

this simple change seems to be sufficient to avoid the crash and the generated code appears to run OK. if it is OK, could someone please commit it? thanks. (feel free to update my log message if it could be clearer or more correct.)

.mrg.

2011-07-10 matthew green m...@eterna.com.au

	* expr.c (emit_push_insn): Don't copy a TYPE_NULL expression to
	the stack for correct alignment.

Index: external/gpl3/gcc/dist/gcc/expr.c
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/expr.c,v
retrieving revision 1.1.1.1
diff -p -u -r1.1.1.1 expr.c
--- external/gpl3/gcc/dist/gcc/expr.c 21 Jun 2011 01:20:17 - 1.1.1.1
+++ external/gpl3/gcc/dist/gcc/expr.c 12 Jul 2011 04:17:00 -
@@ -3764,7 +3764,8 @@ emit_push_insn (rtx x, enum machine_mode
   xinner = x;

   if (mode == BLKmode
-      || (STRICT_ALIGNMENT && align < GET_MODE_ALIGNMENT (mode)))
+      || (STRICT_ALIGNMENT && align < GET_MODE_ALIGNMENT (mode)
+	  && type != NULL_TREE))
     {
       /* Copy a block into the stack, entirely or partially.  */
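The effect of the extra check can be seen in a small stand-alone sketch of the condition. `wants_block_copy` and its parameters are hypothetical and only mirror the shape of the patched test; this is not code from expr.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the patched condition: take the
   block-copy path for BLKmode, or for an under-aligned push on a
   strict-alignment target, but in the second case only when a type is
   available to allocate the aligned temporary from.  */
static int
wants_block_copy (int is_blkmode, int strict_alignment,
		  unsigned align, unsigned mode_align, const void *type)
{
  return is_blkmode
	 || (strict_alignment && align < mode_align && type != NULL);
}
```

With a NULL type and a non-BLKmode operand, the sketch returns 0, so the path that would need a type for its temporary is never entered, which is the crash the report describes.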