Re: [RFC] Do not consider volatile asms as optimization barriers #1
Eric Botcazou <ebotca...@adacore.com> writes:
>> Thanks, and to Bernd for the review.  I went ahead and applied it to trunk.
>
> Thanks.  We need something for the 4.8 branch as well, probably the builtins.c
> hunk and the reversion of the cse.c/cselib.c/dse.c changes to the 4.7 state.

OK, how about this?  It looks like the builtins.c and stmt.c stuff wasn't
merged until 4.9, and at this stage it seemed safer to just add the same
use/clobber sequence to both places.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
	* builtins.c (expand_builtin_setjmp_receiver): Emit a use of the hard
	frame pointer.  Synchronize commentary with mainline.
	* cse.c (cse_insn): Only check for volatile asms.
	* cselib.c (cselib_process_insn): Likewise.
	* dse.c (scan_insn): Likewise.
	* stmt.c (expand_nl_goto_receiver): Emit a use and a clobber of the
	hard frame pointer.

Index: gcc/builtins.c
===
--- gcc/builtins.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/builtins.c	2014-03-12 18:24:17.679262346 +0000
@@ -905,9 +905,24 @@ expand_builtin_setjmp_receiver (rtx rece
   if (! HAVE_nonlocal_goto)
 #endif
     {
+      /* First adjust our frame pointer to its actual value.  It was
+         previously set to the start of the virtual area corresponding to
+         the stacked variables when we branched here and now needs to be
+         adjusted to the actual hardware fp value.
+
+         Assignments to virtual registers are converted by
+         instantiate_virtual_regs into the corresponding assignment
+         to the underlying register (fp in this case) that makes
+         the original assignment true.
+         So the following insn will actually be decrementing fp by
+         STARTING_FRAME_OFFSET.  */
       emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-      /* This might change the hard frame pointer in ways that aren't
-         apparent to early optimization passes, so force a clobber.  */
+
+      /* Restoring the frame pointer also modifies the hard frame pointer.
+         Mark it used (so that the previous assignment remains live once
+         the frame pointer is eliminated) and clobbered (to represent the
+         implicit update from the assignment).  */
+      emit_use (hard_frame_pointer_rtx);
       emit_clobber (hard_frame_pointer_rtx);
     }
 
@@ -948,8 +963,7 @@ expand_builtin_setjmp_receiver (rtx rece
 
   /* We must not allow the code we just generated to be reordered by
      scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  Similarly, we must block
-     (frame-related) register values to be used across this code.  */
+     happen immediately, not later.  */
   emit_insn (gen_blockage ());
 }
 
Index: gcc/cse.c
===
--- gcc/cse.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/cse.c	2014-03-12 18:24:17.680262355 +0000
@@ -5659,9 +5659,10 @@ cse_insn (rtx insn)
         invalidate (XEXP (dest, 0), GET_MODE (dest));
       }
 
-  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
+  /* A volatile ASM invalidates everything.  */
   if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
+      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+      && MEM_VOLATILE_P (PATTERN (insn)))
     flush_hash_table ();
 
   /* Don't cse over a call to setjmp; on some machines (eg VAX)
Index: gcc/cselib.c
===
--- gcc/cselib.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/cselib.c	2014-03-12 18:24:17.681262364 +0000
@@ -2623,12 +2623,13 @@ cselib_process_insn (rtx insn)
 
   cselib_current_insn = insn;
 
-  /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
+  /* Forget everything at a CODE_LABEL, a volatile asm, or a setjmp.  */
   if ((LABEL_P (insn)
        || (CALL_P (insn)
            && find_reg_note (insn, REG_SETJMP, NULL))
        || (NONJUMP_INSN_P (insn)
-           && volatile_insn_p (PATTERN (insn))))
+           && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+           && MEM_VOLATILE_P (PATTERN (insn))))
       && !cselib_preserve_constants)
     {
       cselib_reset_table (next_uid);
Index: gcc/dse.c
===
--- gcc/dse.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/dse.c	2014-03-12 18:24:17.681262364 +0000
@@ -2518,7 +2518,8 @@ scan_insn (bb_info_t bb_info, rtx insn)
   /* Cselib clears the table for this case, so we have to essentially
      do the same.  */
   if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
+      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+      && MEM_VOLATILE_P (PATTERN (insn)))
     {
       add_wild_read (bb_info);
       insn_info->cannot_delete = true;
Index: gcc/stmt.c
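As a rough illustration of what the cse.c/cselib.c/dse.c hunks change, the sketch below models the removed and the added condition side by side. It is not GCC code: `pattern_model`, the `MODEL_*` codes and both function names are hypothetical, and the real `volatile_insn_p` also walks sub-expressions, which this toy version ignores.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins for RTL pattern codes. */
enum pattern_code_model {
  MODEL_SET,
  MODEL_ASM_OPERANDS,
  MODEL_UNSPEC_VOLATILE
};

struct pattern_model {
  enum pattern_code_model code;
  bool is_volatile;   /* stands in for MEM_VOLATILE_P on an asm pattern */
};

/* Models the removed test: a volatile asm or an UNSPEC_VOLATILE
   makes the pass forget everything (top level only in this sketch). */
static bool flushes_with_removed_check(const struct pattern_model *p)
{
  if (p->code == MODEL_UNSPEC_VOLATILE)
    return true;
  return p->code == MODEL_ASM_OPERANDS && p->is_volatile;
}

/* Models the added test: only a volatile asm makes the pass
   forget everything. */
static bool flushes_with_added_check(const struct pattern_model *p)
{
  return p->code == MODEL_ASM_OPERANDS && p->is_volatile;
}
```

Under this model the two checks agree on volatile asms and on ordinary sets, and differ only for an UNSPEC_VOLATILE pattern.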
Re: [RFC] Do not consider volatile asms as optimization barriers #1
On Thu, Mar 13, 2014 at 07:15:34AM +0000, Richard Sandiford wrote:
> Eric Botcazou <ebotca...@adacore.com> writes:
>>> Thanks, and to Bernd for the review.  I went ahead and applied it to trunk.
>>
>> Thanks.  We need something for the 4.8 branch as well, probably the
>> builtins.c hunk and the reversion of the cse.c/cselib.c/dse.c changes to
>> the 4.7 state.
>
> OK, how about this?  It looks like the builtins.c and stmt.c stuff wasn't
> merged until 4.9, and at this stage it seemed safer to just add the same
> use/clobber sequence to both places.

Please wait a little bit, the patch has been committed to the trunk only
very recently, we want to see if it has any fallout.

	Jakub
Re: [gomp4] Accelerator constructs omp lowering and expansion
Hi!

On Wed, 12 Mar 2014 14:48:03 +0100, I wrote:
> On Wed, 4 Sep 2013 20:54:47 +0200, Jakub Jelinek <ja...@redhat.com> wrote:
>> This patch implements #pragma omp {target{, data, update},teams} lowering
>> and expansion, and adds stub calls into libgomp, so that (for now
>> unconditionally) we can at least always fall back to host execution.
>
>> 2013-09-04  Jakub Jelinek  <ja...@redhat.com>
>>
>> 	* omp-low.c [...]
>> 	(create_omp_child_function): If current function has
>> 	"omp declare target" attribute or if current region
>> 	is OMP_TARGET or lexically nested in it, add that
>> 	attribute to the omp child function.
>
> It seems that I have missed this one when generalizing the existing code
> for OpenACC:
>
> [...]
>
> Even if not yet relevant at the moment for OpenACC, I think it makes
> sense to make it more obvious, and change the code as follows.  Will
> commit soon unless someone disagrees.

Committed to gomp-4_0-branch in r208531:

commit d50387c6b64d888e2acf12088979e6147bdaccc9
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Thu Mar 13 07:53:48 2014 +0000

    Properly detect all offloaded regions.

    	gcc/
    	* omp-low.c (create_omp_child_function): Use
    	is_gimple_omp_offloaded when looking for offloaded regions.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@208531 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 4ee843f..5fb4657 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-03-13  Thomas Schwinge  <tho...@codesourcery.com>
+
+	* omp-low.c (create_omp_child_function): Use
+	is_gimple_omp_offloaded when looking for offloaded regions.
+
 2014-03-12  Thomas Schwinge  <tho...@codesourcery.com>
 
 	* omp-low.c (scan_sharing_clauses): Move offloaded logic into...
diff --git gcc/omp-low.c gcc/omp-low.c
index 6b676e5..aa2dd32 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -1978,16 +1978,12 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
     {
       omp_context *octx;
       for (octx = ctx; octx; octx = octx->outer)
-        if (gimple_code (octx->stmt) == GIMPLE_OMP_TARGET
-            && gimple_omp_target_kind (octx->stmt)
-               == GF_OMP_TARGET_KIND_REGION)
+        if (is_gimple_omp_offloaded (octx->stmt))
           {
             target_p = true;
             break;
           }
     }
-  gcc_assert (!is_gimple_omp_oacc_specifically (ctx->stmt)
-              || !target_p);
   if (target_p)
     DECL_ATTRIBUTES (decl)
       = tree_cons (get_identifier ("omp declare target"),


Grüße,
 Thomas
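The loop in the omp-low.c hunk walks from the current context outwards and stops at the first enclosing offloaded region. A minimal stand-alone sketch of that walk follows. Everything in it is hypothetical: `omp_context_model`, the `REGION_*` kinds and `is_offloaded_model` are illustration-only names, and which statement kinds the real `is_gimple_omp_offloaded` accepts is not stated in this thread, so the predicate here simply assumes a single "target" kind.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region kinds; not GCC's gimple codes. */
enum region_kind_model {
  REGION_PARALLEL,
  REGION_TASK,
  REGION_TARGET,
  REGION_TARGET_DATA
};

/* Hypothetical context with a link to the enclosing context. */
struct omp_context_model {
  const struct omp_context_model *outer;
  enum region_kind_model kind;
};

/* Assumption for this sketch only: just REGION_TARGET is offloaded. */
static bool is_offloaded_model(enum region_kind_model kind)
{
  return kind == REGION_TARGET;
}

/* Returns true if ctx or any enclosing context is an offloaded region,
   mirroring the shape of the "for (octx = ctx; ...)" loop in the patch. */
static bool nested_in_offloaded_region(const struct omp_context_model *ctx)
{
  for (; ctx != NULL; ctx = ctx->outer)
    if (is_offloaded_model(ctx->kind))
      return true;
  return false;
}
```

Keeping the "what counts as offloaded" decision in one predicate means the walk does not need to change when further region kinds are added, which seems to be the point of the commit.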
Re: [patch,avr] Fix PR60486: Typo cc_plus against cc_minus in calls of avr_out_plus_1
2014-03-12 17:35 GMT+04:00 Georg-Johann Lay <a...@gjlay.de>:
> This fixes a problem because cc_plus and cc_minus are in the wrong places
> in calls of avr_out_plus_1.
>
> This is important when avr_out_plus is called from notice_update_cc.  This
> means that cc_status might be determined incorrectly.
>
> In the vast majority of cases this leads to performance regression because
> of superfluous comparisons when an addition (using SUB instructions) has
> already set the condition code.  But there are also cases where this might
> lead to wrong code.
>
> No changes in test suite results.
>
> Ok to apply?
>
> I didn't follow this list for some time.  Is trunk open for such changes?
> If so, I would apply it to trunk and 4.8 branch, otherwise to 4.8, 4.9 and
> trunk once they are open again.

Please apply.
As I remember, the trunk is always open for port-specific changes.

Denis.
RFC LeakSanitizer tests.
Hi,

This patch adds an initial set of tests and dg infrastructure for the
LeakSanitizer runtime.

Tested on x86_64.

Ok to commit?

-Maxim

2014-03-13  Max Ostapenko  <m.ostape...@partner.samsung.com>

	* c-c++-common/lsan/fork.c: New test.
	* c-c++-common/lsan/ignore_object.c: New test.
	* c-c++-common/lsan/ignore_object_errors.c: New test.
	* c-c++-common/lsan/large_allocation_leak.c: New test.
	* c-c++-common/lsan/leak_check_at_exit-1.c: New test.
	* c-c++-common/lsan/leak_check_at_exit-2.c: New test.
	* c-c++-common/lsan/link_turned_off.c: New test.
	* c-c++-common/lsan/pointer_to_self.c: New test.
	* c-c++-common/lsan/suppressions_default.c: New test.
	* c-c++-common/lsan/swapcontext-1.c: New test.
	* c-c++-common/lsan/swapcontext-2.c: New test.
	* c-c++-common/lsan/use_after_return.c: New test.
	* c-c++-common/lsan/use_globals_initialized.c: New test.
	* c-c++-common/lsan/use_globals_uninitialized.c: New test.
	* c-c++-common/lsan/use_stacks.c: New test.
	* c-c++-common/lsan/use_tls_static.c: New test.
	* c-c++-common/lsan/use_unaligned.c: New test.
	* g++.dg/lsan/lsan.exp: New file.
	* gcc.dg/lsan/lsan.exp: New file.
	* lib/lsan-dg.exp: New file.

diff --git a/gcc/testsuite/c-c++-common/lsan/fork.c b/gcc/testsuite/c-c++-common/lsan/fork.c
new file mode 100644
index 0000000..4dc9d4b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/fork.c
@@ -0,0 +1,23 @@
+// Test that thread local data is handled correctly after forking without exec().
+/* { dg-do run } */
+
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+__thread void *thread_local_var;
+
+int main() {
+  int status = 0;
+  thread_local_var = malloc(1337);
+  pid_t pid = fork();
+  assert(pid >= 0);
+  if (pid > 0) {
+    waitpid(pid, &status, 0);
+    assert(WIFEXITED(status));
+    return WEXITSTATUS(status);
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/lsan/ignore_object.c b/gcc/testsuite/c-c++-common/lsan/ignore_object.c
new file mode 100644
index 0000000..d73f08b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/ignore_object.c
@@ -0,0 +1,31 @@
+// Test for __lsan_ignore_object().
+
+/* { dg-do run } */
+/* { dg-set-target-env-var LSAN_OPTIONS "report_objects=1:use_registers=0:use_stacks=0:use_globals=0:use_tls=0:verbosity=3" } */
+/* { dg-set-target-env-var ASAN_OPTIONS "verbosity=3" } */
+/* { dg-shouldfail "lsan" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __cplusplus
+extern "C"
+#endif
+void __lsan_ignore_object(void **p);
+
+int main() {
+  // Explicitly ignored object.
+  void **p = (void **) malloc(sizeof(void **));
+  // Transitively ignored object.
+  *p = malloc(666);
+  // Non-ignored object.
+  volatile void *q = malloc(1337);
+  fprintf(stderr, "Test alloc: %p.\n", p);
+  fprintf(stderr, "Test alloc_2: %p.\n", q);
+  __lsan_ignore_object(p);
+  return 0;
+}
+
+/* { dg-output "Test alloc: .*" } */
+/* { dg-output "ignoring heap object at .*" } */
+/* { dg-output "SUMMARY: (Leak|Address)Sanitizer: 1337 byte\\(s\\) leaked in 1 allocation\\(s\\).*" } */
diff --git a/gcc/testsuite/c-c++-common/lsan/ignore_object_errors.c b/gcc/testsuite/c-c++-common/lsan/ignore_object_errors.c
new file mode 100644
index 0000000..47a1cd1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/ignore_object_errors.c
@@ -0,0 +1,25 @@
+// Test for incorrect use of __lsan_ignore_object().
+/* { dg-do run } */
+/* { dg-set-target-env-var LSAN_OPTIONS "verbosity=2" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __cplusplus
+extern "C"
+#endif
+void __lsan_ignore_object(const void *p);
+
+int main() {
+  void *p = malloc(1337);
+  fprintf(stderr, "Test alloc: %p.\n", p);
+  __lsan_ignore_object(p);
+  __lsan_ignore_object(p);
+  free(p);
+  __lsan_ignore_object(p);
+  return 0;
+}
+
+/* { dg-output "Test alloc: .*" } */
+/* { dg-output "heap object at .* is already being ignored.*" } */
+/* { dg-output "no heap object found at .*" } */
diff --git a/gcc/testsuite/c-c++-common/lsan/large_allocation_leak.c b/gcc/testsuite/c-c++-common/lsan/large_allocation_leak.c
new file mode 100644
index 0000000..36511d3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/large_allocation_leak.c
@@ -0,0 +1,19 @@
+// Test that LargeMmapAllocator's chunks aren't reachable via some internal data structure.
+/* { dg-do run } */
+/* { dg-set-target-env-var LSAN_OPTIONS "report_objects=1:use_stacks=0:use_registers=0" } */
+/* { dg-shouldfail "lsan" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+int main() {
+  // maxsize in primary allocator is always less than this (1 << 25).
+  void *large_alloc = malloc(33554432);
+  fprintf(stderr, "Test alloc: %p.\n", large_alloc);
+  return 0;
+}
+
+/* { dg-output "Test alloc: .*" } */
+/* { dg-output "Directly leaked 33554432 byte object at .*" } */
+/* { dg-output "LeakSanitizer: detected memory leaks.*" } */
+/* { dg-output "SUMMARY: (Leak|Address)Sanitizer" } */
diff --git a/gcc/testsuite/c-c++-common/lsan/leak_check_at_exit-1.c b/gcc/testsuite/c-c++-common/lsan/leak_check_at_exit-1.c
new file mode 100644
Re: [PATCH] Use the LTO linker plugin by default
On Wed, 12 Mar 2014, Rainer Orth wrote:
> Richard Biener <rguent...@suse.de> writes:
>> On Mon, 10 Mar 2014, Rainer Orth wrote:
>>> Richard Biener <rguent...@suse.de> writes:
>>>> Ouch.  But as lto-plugin is a host module it should link against the
>>>> host libgcc, no?  During stage1, that is.  So the question is why
>>>> does it use the gcc/ compiler?  For me it's using the host gcc:
>>>>
>>>> gcc -DHAVE_CONFIG_H -I. -I/space/rguenther/tramp3d/trunk/lto-plugin -I/space/rguenther/tramp3d/trunk/lto-plugin/../include -DHAVE_CONFIG_H -Wall -g -c /space/rguenther/tramp3d/trunk/lto-plugin/lto-plugin.c -fPIC -DPIC -o .libs/lto-plugin.o
>>>> /bin/sh ./libtool --tag=CC --tag=disable-static --mode=link gcc -Wall -g -module -bindir /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 -static-libstdc++ -static-libgcc -o liblto_plugin.la -rpath /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 lto-plugin.lo -Wc,../libiberty/pic/libiberty.a
>>>> libtool: link: gcc -shared .libs/lto-plugin.o ../libiberty/pic/libiberty.a -Wl,-soname -Wl,liblto_plugin.so.0 -o .libs/liblto_plugin.so.0.0.0
>>>
>>> It does use the host compiler for me, too.
>>
>> So then if it succeeds to link to a shared libgcc_s then why is it
>> not able to find that later?  Maybe you miss setting of a suitable
>> LD_LIBRARY_PATH to pick up the runtime for your host compiler?
>
> For the same reason that we use -static-libstdc++ to avoid this issue
> for libstdc++.so.  I've always considered gcc's tendency to build
> binaries that don't run by default a major annoyance, all the weasel
> wording in the FAQ notwithstanding.  I hope to finally do something

True, but if your host compiler builds sth then it's the host compiler
install's business to make sure it can run ... (and thus make the
libgcc_s it links to available or only provide a static libgcc_s).

For this particular case at least.

Note that I'm not against linking against static libgcc_s for
lto-plugin.  The -static-libstdc++ we use is just because during
bootstrap picking up the correct libstdc++ was deemed too hard
to implement and thus the easy way out was -static-libstdc++.

> about it for 4.10/5.0 (btw., any word on what the next release is going
> to be?).

I guess for the simple reason of not breaking scripts we'll go for 5.0
(and very much hope the libstdc++ ABI issue will solve itself in time).

I've also suggested to drop the major/minor version difference and
go with 5.x, 6.x, 7.x releases going forward.  We can have a
bikeshedding BOF at the Cauldron.

Richard.
[PATCH] BZ60501: Add addptr optab
Hi, fixes the LRA problems described in: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60501 and http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57604 Bootstrapped and regtested on s390, s390x, and x86_64. Ok? Bye, -Andreas- 2014-03-13 Andreas Krebbel andreas.kreb...@de.ibm.com PR rtl-optimization/60501 * optabs.def (addptr3_optab): New optab. * optabs.c (gen_addptr3_insn, have_addptr3_insn): New function. * doc/md.texi (addptrm3): Document new RTL standard expander. * expr.h (gen_addptr3_insn, have_addptr3_insn): Add prototypes. * lra.c (emit_add3_insn): Use the addptr pattern if available. * config/s390/s390.md (addptrdi3, addptrsi3): New expanders. diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 76902b5..7d9d1ad 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -5034,6 +5034,57 @@ [(set_attr op_type RRer,RXE) (set_attr type fsimpmode)]) +; +; Pointer add instruction patterns +; + +; This will match *la_64 +(define_expand addptrdi3 + [(set (match_operand:DI 0 register_operand ) +(plus:DI (match_operand:DI 1 register_operand ) +(match_operand:DI 2 nonmemory_operand )))] + TARGET_64BIT +{ + HOST_WIDE_INT c = INTVAL (operands[2]); + + if (GET_CODE (operands[2]) == CONST_INT) +{ + if (!CONST_OK_FOR_CONSTRAINT_P (c, 'K', K) + !CONST_OK_FOR_CONSTRAINT_P (c, 'O', Os)) +{ + operands[2] = force_const_mem (DImode, operands[2]); + operands[2] = force_reg (DImode, operands[2]); +} + else if (!DISP_IN_RANGE (INTVAL (operands[2]))) +operands[2] = force_reg (DImode, operands[2]); +} +}) + +; For 31 bit we have to prevent the generated pattern from matching +; normal ADDs since la only does a 31 bit add. This is supposed to +; match force_la_31. 
+(define_expand addptrsi3 + [(parallel +[(set (match_operand:SI 0 register_operand ) + (plus:SI (match_operand:SI 1 register_operand ) + (match_operand:SI 2 nonmemory_operand ))) + (use (const_int 0))])] + !TARGET_64BIT +{ + HOST_WIDE_INT c = INTVAL (operands[2]); + + if (GET_CODE (operands[2]) == CONST_INT) +{ + if (!CONST_OK_FOR_CONSTRAINT_P (c, 'K', K) + !CONST_OK_FOR_CONSTRAINT_P (c, 'O', Os)) +{ + operands[2] = force_const_mem (SImode, operands[2]); + operands[2] = force_reg (SImode, operands[2]); +} + else if (!DISP_IN_RANGE (INTVAL (operands[2]))) +operands[2] = force_reg (SImode, operands[2]); +} +}) ;; ;;- Subtract instructions. diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 746acc2..972b717 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -4720,6 +4720,17 @@ Add operand 2 and operand 1, storing the result in operand 0. All operands must have mode @var{m}. This can be used even on two-address machines, by means of constraints requiring operands 1 and 0 to be the same location. +@cindex @code{addptr@var{m}3} instruction pattern +@item @samp{addptr@var{m}3} +Like @code{addptr@var{m}3} but does never clobber the condition code. +It only needs to be defined if @code{add@var{m}3} either sets the +condition code or address calculations cannot be performed with the +normal add instructions due to other reasons. If adds used for +address calculations and normal adds are not compatible it is required +to expand a distinct pattern (e.g. using an unspec). The pattern is +used by LRA to emit address calculations. @code{add@var{m}3} is used +if @code{addptr@var{m}3} is not defined. + @cindex @code{ssadd@var{m}3} instruction pattern @cindex @code{usadd@var{m}3} instruction pattern @cindex @code{sub@var{m}3} instruction pattern diff --git a/gcc/expr.h b/gcc/expr.h index 5111f06..524da67 100644 --- a/gcc/expr.h +++ b/gcc/expr.h @@ -180,10 +180,12 @@ extern void emit_libcall_block (rtx, rtx, rtx, rtx); Likewise for subtraction and for just copying. 
*/ extern rtx gen_add2_insn (rtx, rtx); extern rtx gen_add3_insn (rtx, rtx, rtx); +extern rtx gen_addptr3_insn (rtx, rtx, rtx); extern rtx gen_sub2_insn (rtx, rtx); extern rtx gen_sub3_insn (rtx, rtx, rtx); extern rtx gen_move_insn (rtx, rtx); extern int have_add2_insn (rtx, rtx); +extern int have_addptr3_insn (rtx, rtx, rtx); extern int have_sub2_insn (rtx, rtx); /* Emit a pair of rtl insns to compare two rtx's and to jump diff --git a/gcc/lra.c b/gcc/lra.c index 77074e2..e5e81474 100644 --- a/gcc/lra.c +++ b/gcc/lra.c @@ -254,6 +254,19 @@ emit_add3_insn (rtx x, rtx y, rtx z) rtx insn, last; last = get_last_insn (); + + if (have_addptr3_insn (x, y, z)) +{ + insn = gen_addptr3_insn (x, y, z); + + /* If the target provides an addptr pattern it hopefully does +for a reason. So falling back to the normal add would be +a bug. */ + lra_assert (insn != NULL_RTX); + emit_insn (insn); + return insn; +} + insn =
Re: [PATCH] Try to avoid sorting on SSA_NAME_VERSION during reassoc (PR middle-end/60418)
On Wed, 12 Mar 2014, Jakub Jelinek wrote:

> Hi!
>
> Apparently 435.gromacs benchmark is very sensitive (of course with
> -ffast-math) to reassociation ordering.
>
> We were sorting on SSA_NAME_VERSIONs, which has the disadvantage
> that we reuse SSA_NAME_VERSIONs from SSA_NAMEs dropped by earlier
> optimization passes and thus even minor changes in unrelated parts of
> function in unrelated optimizations can have very big effects on
> reassociation decisions.
>
> As discussed on IRC and in bugzilla, this patch attempts to sort on the
> ordering of SSA_NAME_DEF_STMT statements.  If they are in different basic
> blocks, it uses bb_rank for sorting, if they are within the same bb, it
> checks which stmt dominates the other one in the bb (using gimple_uid).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Does this also fix the PPC regression?

Thanks,
Richard.

> 2014-03-12  Jakub Jelinek  <ja...@redhat.com>
>
> 	PR tree-optimization/59025
> 	PR middle-end/60418
> 	* tree-ssa-reassoc.c (sort_by_operand_rank): For SSA_NAMEs with the
> 	same rank, sort by bb_rank and gimple_uid of SSA_NAME_DEF_STMT first.
>
> --- gcc/tree-ssa-reassoc.c.jj	2014-03-10 18:12:30.782215912 +0100
> +++ gcc/tree-ssa-reassoc.c	2014-03-12 10:09:03.341757696 +0100
> @@ -219,6 +219,7 @@ static struct pointer_map_t *operand_ran
>  
>  /* Forward decls.  */
>  static long get_rank (tree);
> +static bool reassoc_stmt_dominates_stmt_p (gimple, gimple);
>  
>  
>  /* Bias amount for loop-carried phis.  We want this to be larger than
> @@ -506,11 +507,37 @@ sort_by_operand_rank (const void *pa, co
>      }
>  
>    /* Lastly, make sure the versions that are the same go next to each
> -     other.  We use SSA_NAME_VERSION because it's stable.  */
> +     other.  */
>    if ((oeb->rank - oea->rank == 0)
>        && TREE_CODE (oea->op) == SSA_NAME
>        && TREE_CODE (oeb->op) == SSA_NAME)
>      {
> +      /* As SSA_NAME_VERSION is assigned pretty randomly, because we reuse
> +         versions of removed SSA_NAMEs, so if possible, prefer to sort
> +         based on basic block and gimple_uid of the SSA_NAME_DEF_STMT.
> +         See PR60418.  */
> +      if (!SSA_NAME_IS_DEFAULT_DEF (oea->op)
> +          && !SSA_NAME_IS_DEFAULT_DEF (oeb->op)
> +          && SSA_NAME_VERSION (oeb->op) != SSA_NAME_VERSION (oea->op))
> +        {
> +          gimple stmta = SSA_NAME_DEF_STMT (oea->op);
> +          gimple stmtb = SSA_NAME_DEF_STMT (oeb->op);
> +          basic_block bba = gimple_bb (stmta);
> +          basic_block bbb = gimple_bb (stmtb);
> +          if (bbb != bba)
> +            {
> +              if (bb_rank[bbb->index] != bb_rank[bba->index])
> +                return bb_rank[bbb->index] - bb_rank[bba->index];
> +            }
> +          else
> +            {
> +              bool da = reassoc_stmt_dominates_stmt_p (stmta, stmtb);
> +              bool db = reassoc_stmt_dominates_stmt_p (stmtb, stmta);
> +              if (da != db)
> +                return da ? 1 : -1;
> +            }
> +        }
> +
>        if (SSA_NAME_VERSION (oeb->op) != SSA_NAME_VERSION (oea->op))
>          return SSA_NAME_VERSION (oeb->op) - SSA_NAME_VERSION (oea->op);
>        else
>
> 	Jakub

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
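The tie-breaking order described in the patch can be sketched as a plain `qsort` comparator. This is a simplified model, not the GCC function: `operand_model` and its fields are hypothetical, and "statement a dominates statement b within a block" is approximated here by "a has the smaller uid", which is an assumption of this sketch.

```c
#include <stdlib.h>

/* Hypothetical flattened view of one reassociation operand. */
struct operand_model {
  long rank;         /* operand rank                            */
  int bb;            /* id of the block holding the definition  */
  long bb_rank;      /* rank of that block                      */
  unsigned uid;      /* position of the definition in its block */
  unsigned version;  /* recycled number, used only as last resort */
};

/* Returns >0 when b's key is larger, i.e. sorts larger keys first. */
static int descending(long a, long b)
{
  return (b > a) - (b < a);
}

static int compare_operands_model(const void *pa, const void *pb)
{
  const struct operand_model *a = pa;
  const struct operand_model *b = pb;

  int c = descending(a->rank, b->rank);
  if (c != 0)
    return c;

  if (a->bb != b->bb) {
    /* Different blocks: prefer the block rank when it differs. */
    if (a->bb_rank != b->bb_rank)
      return descending(a->bb_rank, b->bb_rank);
  } else if (a->uid != b->uid) {
    /* Same block: the earlier definition sorts later, mirroring
       "return da ? 1 : -1" in the patch. */
    return a->uid < b->uid ? 1 : -1;
  }

  /* Fall back to the version number, as the patched code still does. */
  return descending((long) a->version, (long) b->version);
}
```

A call such as `qsort(ops, n, sizeof ops[0], compare_operands_model)` then yields an order that depends on where definitions sit in the function rather than on which version numbers happened to be free.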
Re: [PATCH] Fix PR60505
On Wed, 12 Mar 2014, Cong Hou wrote:

> Thank you for pointing it out.  I didn't realize that alias analysis
> has influences on this issue.
>
> The current problem is that the epilogue may be unnecessary if the
> loop bound cannot be larger than the number of iterations of the
> vectorized loop multiplied by VF when the vectorized loop is supposed
> to be executed.  My method is incorrect because I assume the vectorized
> loop will be executed which is actually guaranteed by loop bound check
> (and also alias checks).  So if the alias checks exist, my method is
> fine as both conditions are met.

But there is still the loop bound check which, if it fails, uses
the epilogue loop as fallback, not the scalar versioned loop.

> If there are no alias checks, I must
> consider the possibility that the vectorized loop may not be executed
> at runtime and then the epilogue should not be eliminated.  The warning
> appears on epilogue, and with loop bound checks (and without alias
> checks) the warning will be gone.  So I think the key is alias checks:
> my method only works if there are no alias checks.
>
> How about adding one more condition that checks if alias checks are
> needed, as the code shown below?
>
>   else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
>            || (tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
>                < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo))
>                && (!LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)
>                    || (unsigned HOST_WIDE_INT)max_stmt_executions_int
>                         (LOOP_VINFO_LOOP (loop_vinfo)) > (unsigned)th)))
>     LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true;
>
>
> thanks,
> Cong
>
>
> On Wed, Mar 12, 2014 at 1:24 AM, Jakub Jelinek <ja...@redhat.com> wrote:
>> On Tue, Mar 11, 2014 at 04:16:13PM -0700, Cong Hou wrote:
>>> This patch is fixing PR60505 in which the vectorizer may produce
>>> unnecessary epilogues.
>>>
>>> Bootstrapped and tested on a x86_64 machine.
>>>
>>> OK for trunk?
>>
>> That looks wrong.  Consider the case where the loop isn't versioned,
>> if you disable generation of the epilogue loop, you end up only with
>> a vector loop.
>>
>> Say:
>> unsigned char ovec[16] __attribute__((aligned (16))) = { 0 };
>> void
>> foo (char *__restrict in, char *__restrict out, int num)
>> {
>>   int i;
>>
>>   in = __builtin_assume_aligned (in, 16);
>>   out = __builtin_assume_aligned (out, 16);
>>   for (i = 0; i < num; ++i)
>>     out[i] = (ovec[i] = in[i]);
>>   out[num] = ovec[num / 2];
>> }
>> -O2 -ftree-vectorize.  Now, consider if this function is called
>> with num != 16 (num > 16 is of course invalid, but num 0 to 15 is
>> valid and your patch will cause a wrong-code in this case).
>>
>> 	Jakub

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
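To make the role of the epilogue concrete, here is a small hand-written sketch of the loop shape under discussion: a blocked main loop standing in for the vectorized loop, followed by a scalar epilogue for the leftover iterations. It is not vectorizer output; `VF`, `copy_with_epilogue` and `copy_without_epilogue` are names made up for illustration, and the variant without an epilogue simply skips the leftover elements, whereas generated code could misbehave in other ways.

```c
#include <stddef.h>

enum { VF = 16 };  /* assumed vectorization factor for this sketch */

/* Main loop handles whole groups of VF elements; the epilogue
   handles the remaining n % VF elements one at a time. */
static void copy_with_epilogue(const char *in, char *out, size_t n)
{
  size_t i = 0;

  for (; i + VF <= n; i += VF)
    for (size_t lane = 0; lane < VF; ++lane)
      out[i + lane] = in[i + lane];

  for (; i < n; ++i)   /* epilogue */
    out[i] = in[i];
}

/* Same main loop with the epilogue dropped: only correct when n is
   a multiple of VF. */
static void copy_without_epilogue(const char *in, char *out, size_t n)
{
  for (size_t i = 0; i + VF <= n; i += VF)
    for (size_t lane = 0; lane < VF; ++lane)
      out[i + lane] = in[i + lane];
}
```

For any `n` from 1 to 15 the second function leaves the output untouched, which is the kind of wrong result the objection above is about: nothing in the function itself guarantees that the trip count is a multiple of the vector width.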
Re: [PATCH] Try to avoid sorting on SSA_NAME_VERSION during reassoc (PR middle-end/60418)
On Thu, Mar 13, 2014 at 10:25:57AM +0100, Richard Biener wrote:
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Ok.
>
> Does this also fix the PPC regression?

That is a question for Peter, if we can ask him to test it.
From what I understood, the bug no longer reproduces on the trunk on ppc*,
and Peter tested an earlier version of this patch on top of an old revision
where it still reproduced.

>> 2014-03-12  Jakub Jelinek  <ja...@redhat.com>
>>
>> 	PR tree-optimization/59025
>> 	PR middle-end/60418
>> 	* tree-ssa-reassoc.c (sort_by_operand_rank): For SSA_NAMEs with the
>> 	same rank, sort by bb_rank and gimple_uid of SSA_NAME_DEF_STMT first.

	Jakub
Re: [patch] make -flto -save-temps less verbose
On Thu, Mar 13, 2014 at 1:10 AM, Cesar Philippidis <ce...@codesourcery.com> wrote:
> I noticed that the lto-wrapper is a little noisy without the -v option
> when -save-temps is used.  E.g.,
>
> $ gcc main.c -flto -save-temps
> [Leaving LTRANS /tmp/ccSEvaB7.args]
> [Leaving LTRANS /tmp/ccQomDzb.ltrans.out]
> [Leaving LTRANS /tmp/ccVzWdGZ.args]
> [Leaving LTRANS /tmp/ccQomDzb.ltrans0.o]
>
> Those messages probably should be suppressed unless the user wants
> verbose diagnostics.  They also show up as errors in the testsuite
> (although none currently use -save-temps with -flto, yet).  The attached
> patch addresses this issue by disabling those messages unless the user
> passes -v to the driver.  I've also included a simple test case which
> would fail without the change.
>
> Is this OK for stage-4?  If so, please check it in since I don't have an
> SVN account.

Ok (I'll check it in).

Thanks,
Richard.

> Thanks,
> Cesar
Re: [PATCH 1/4] [GOMP4] [Fortran] OpenACC 1.0+ support in fortran front-end
Committed as r208533. -- Ilmir.
Re: [PATCH] Use the LTO linker plugin by default
Richard Biener <rguent...@suse.de> writes:

>>> So then if it succeeds to link to a shared libgcc_s then why is it
>>> not able to find that later?  Maybe you miss setting of a suitable
>>> LD_LIBRARY_PATH to pick up the runtime for your host compiler?
>>
>> For the same reason that we use -static-libstdc++ to avoid this issue
>> for libstdc++.so.  I've always considered gcc's tendency to build
>> binaries that don't run by default a major annoyance, all the weasel
>> wording in the FAQ notwithstanding.  I hope to finally do something
>
> True, but if your host compiler builds sth then it's the host compiler
> install's business to make sure it can run ... (and thus make the
> libgcc_s it links to available or only provide a static libgcc_s).

Exactly my words.  But gcc provides zero help for that.  All proposed
workarounds (having every user of gcc set LD_LIBRARY_PATH, messing
around with ldconfig or equivalent, having every single compiler user
provide -rpath/-R on his own) are usability or functionality nightmares
one way or another: the first and second don't scale (imagine a large
site with hundreds of machines and users), and the third imposes work on
the user that the compiler can do best on its own.  This may be somewhat
acceptable for single-user/single-system installations of knowledgeable
users, but otherwise it's just a bad joke.

The compiler has all the information when/how to pass -rpath/-R and
should provide an option to do so.  And if a target has different
multilibs, the user suddenly needs to know not only about $libdir, but
about the various multilib (sub)dirs.  Not what I consider
user-friendly, and I've seen a large enough share of somewhat
experienced users fail with that.

> For this particular case at least.
>
> Note that I'm not against linking against static libgcc_s for
> lto-plugin.  The -static-libstdc++ we use is just because during
> bootstrap picking up the correct libstdc++ was deemed too hard
> to implement and thus the easy way out was -static-libstdc++.

So how should we go forward with this issue?  This bootstrap failure is
a regression from all previous releases.  As I said, I'd rather not
duplicate the -static-libgcc test from the toplevel, but would do so if
all else fails.  Perhaps Paolo could weigh in as the build maintainer?

>> about it for 4.10/5.0 (btw., any word on what the next release is going
>> to be?).
>
> I guess for the simple reason of not breaking scripts we'll go for 5.0
> (and very much hope the libstdc++ ABI issue will solve itself in time).

Yes, that would be amazing.

> I've also suggested to drop the major/minor version difference and
> go with 5.x, 6.x, 7.x releases going forward.  We can have a
> bikeshedding BOF at the Cauldron.

This is more than bikeshedding, I believe, because using major versions
this way suggests larger/incompatible changes to users when there are
probably none.  And it loses us the ability to signal a real large
change (like an ABI break).

	Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] Fix lto.exp dg-final error catching
While trying cleanup-saved-temps in a LTO testcase (which of course
doesn't work ... "error executing dg-final: bad level 5" (!??)) I ran
into a TCL error printing the error - we use bogus variables.

Fixed as obvious.

Richard.

2014-03-13  Richard Biener  <rguent...@suse.de>

	* lib/lto.exp (lto-execute): Fix error catching for dg-final.

Index: gcc/testsuite/lib/lto.exp
===
--- gcc/testsuite/lib/lto.exp	(revision 208532)
+++ gcc/testsuite/lib/lto.exp	(working copy)
@@ -559,11 +559,11 @@ proc lto-execute { src1 sid } {
         verbose "Running dg-final tests." 3
         verbose "dg-final-proc:\n[info body dg-final-proc]" 4
         if [catch "dg-final-proc $src1" errmsg] {
-            perror "$name: error executing dg-final: $errmsg"
+            perror "$src1: error executing dg-final: $errmsg"
             # ??? The call to unresolved here is necessary to clear
             # `errcnt'.  What we really need is a proc like perror that
             # doesn't set errcnt.  It should also set exit_status to 1.
-            unresolved "$name: error executing dg-final: $errmsg"
+            unresolved "$src1: error executing dg-final: $errmsg"
         }
     }
Re: [PATCH] BZ60501: Add addptr optab
On Thu, Mar 13, 2014 at 10:24:13AM +0100, Andreas Krebbel wrote:
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -4720,6 +4720,17 @@ Add operand 2 and operand 1, storing the result in operand 0.  All operands
>  must have mode @var{m}.  This can be used even on two-address machines, by
>  means of constraints requiring operands 1 and 0 to be the same location.
>
> +@cindex @code{addptr@var{m}3} instruction pattern
> +@item @samp{addptr@var{m}3}
> +Like @code{addptr@var{m}3} but does never clobber the condition code.

Didn't you mean "Like @code{add@var{m}3}" here?

> +int
> +have_addptr3_insn (rtx x, rtx y, rtx z)

Missing function comment.

Otherwise looks good to me, but please give Vladimir, Jeff or Eric 24 hours
to comment on it.

	Jakub
Re: [PING^4][PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope
Ping!

On 03/06/2014 07:44 PM, Dimitris Papavasiliou wrote:

Ping!

On 02/27/2014 11:44 AM, Dimitris Papavasiliou wrote:

Ping!

On 02/20/2014 12:11 PM, Dimitris Papavasiliou wrote:

Hello all,

Pinging this patch review request again. See previous messages quoted below for details.

Regards,
Dimitris

On 02/13/2014 04:22 PM, Dimitris Papavasiliou wrote:

Hello,

Pinging this patch review request. Can someone involved in the Objective-C language frontend have a quick look at the description of the proposed features and tell me if it'd be ok to have them in the trunk so I can go ahead and create proper patches?

Thanks,
Dimitris

On 02/06/2014 11:25 AM, Dimitris Papavasiliou wrote:

Hello,

This is a patch regarding a couple of Objective-C related dialect options and warning switches. I submitted it a while ago but gave up after pinging a couple of times. I am now informed that I should have kept pinging until I got someone's attention, so I'm resending it.

The patch is now against an old revision and, as I stated originally, it's probably not in a state that can be adopted as is. I'm sending it as is so that the implemented features can be assessed in terms of their usefulness. If they're welcome, I'd be happy to make any necessary changes to bring it up to date, split it into smaller patches, add test cases and anything else that is deemed necessary.

Here's the relevant text from my initial message:

Two of these switches are related to a feature request I submitted a while ago, Bug 56044 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't reproduce the entire argument here since it is available in the feature request. The relevant functionality in the patch comes in the form of two switches:

-Wshadow-ivars, which controls the "local declaration of ‘somevar’ hides instance variable" warning, which curiously is enabled by default instead of being controlled at least by -Wshadow. The patch changes it so that this warning can be enabled and disabled specifically through -Wshadow-ivars, as well as together with all other shadowing-related warnings through -Wshadow. The reason for the extra switch is that, while searching the Internet for a solution to this problem, I found that other people are inconvenienced by this particular warning as well, so it might be useful to be able to turn it off while keeping all the other shadowing-related warnings enabled.

-flocal-ivars, which when true, as it is by default, treats instance variables as having local scope. If false (-fno-local-ivars), instance variables must always be referred to as self->ivarname, and references to ivarname resolve to the local or global scope as usual.

I've also taken the opportunity to add another switch unrelated to the above but related to instance variables:

-fivar-visibility, which can be set to one of private, protected (the default), public, or package. This sets the default instance variable visibility, which normally is implicitly protected. My use case for it is basically to be able to set it to public and thus effectively disable this visibility mechanism altogether, which I find no use for and therefore have to circumvent. I'm not sure if anyone else feels the same way about this, but I figured it was worth a try.

I'm attaching a preliminary patch against the current revision in case anyone wants to have a look. The changes are very small and any blatant mistakes should be immediately obvious. I have to admit to having virtually no knowledge of the internals of GCC, but I have tried to keep in line with formatting guidelines and general style, and I looked up the particulars of the way options are handled in the available documentation to avoid blind copy-pasting. I have also tried to test the functionality both in my own (relatively large, or at least not too small) project and with small test programs, and everything works as expected.

Finally, I tried running the tests too, but these fail to complete in both the patched and unpatched versions, possibly due to the way I've configured GCC.

Dimitris
Re: [patch] make -flto -save-temps less verbose
On Thu, Mar 13, 2014 at 10:31 AM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Mar 13, 2014 at 1:10 AM, Cesar Philippidis ce...@codesourcery.com wrote: I noticed that the lto-wrapper is a little noisy without the -v option when -save-temps is used. E.g., $ gcc main.c -flto -save-temps [Leaving LTRANS /tmp/ccSEvaB7.args] [Leaving LTRANS /tmp/ccQomDzb.ltrans.out] [Leaving LTRANS /tmp/ccVzWdGZ.args] [Leaving LTRANS /tmp/ccQomDzb.ltrans0.o] Those messages probably should be suppressed unless the user wants verbose diagnostics. They also show up as errors in the testsuite (although none currently use -save-temps with -flto, yet). The attached patch addresses this issue by disabling those messages unless the user passes -v to the driver. I've also included a simple test case which would fail without the change. Is this OK for stage-4? If so, please check it in since I don't have an SVN account. Ok (I'll check it in). I have not committed the testcase as it leaves the saved-temps files behind and /* { dg-final { cleanup-saved-temps } } */ doesn't work. May I ask you to see why and eventually fix it? Supposedly some weird TCL upvar stuff ... I get (after my lto.exp fix) Running /space/rguenther/src/svn/trunk/gcc/testsuite/gcc.dg/lto/lto.exp ... ERROR: /space/rguenther/src/svn/trunk/gcc/testsuite/gcc.dg/lto/save-temps_0.c: error executing dg-final: bad level 5 not sure how to set verboseness or debug that stuff (and no time to do that right now). Richard. Thanks, Richard. Thanks, Cesar
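For readers following the thread, the behaviour change described above (print the "[Leaving LTRANS ...]" note only when -v was given) can be sketched roughly as below. This is a minimal illustration and not the lto-wrapper code: the helper name `leave_temp_message` and its parameters are hypothetical, and only the message text is taken from the report.

```c
#include <stdio.h>

/* Hypothetical helper (not from lto-wrapper): format the note about a
   temporary file that is kept because of -save-temps, but only when the
   user asked for verbose output.  Returns the number of characters
   written to BUF, or 0 when the note is suppressed.  */
static int
leave_temp_message (char *buf, size_t size, const char *file, int verbose)
{
  if (!verbose)
    {
      /* Quiet by default: leave an empty string behind.  */
      if (size > 0)
        buf[0] = '\0';
      return 0;
    }
  return snprintf (buf, size, "[Leaving LTRANS %s]\n", file);
}
```

With `verbose` set to 0 nothing is produced, which is the property the proposed test case relies on: any stray output would show up as an error in the testsuite.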
Re: [PATCH] Use the LTO linker plugin by default
On Thu, 13 Mar 2014, Rainer Orth wrote:

Richard Biener rguent...@suse.de writes:

So then if it succeeds to link to a shared libgcc_s then why is it not able to find that later? Maybe you miss setting of a suitable LD_LIBRARY_PATH to pick up the runtime for your host compiler?

For the same reason that we use -static-libstdc++ to avoid this issue for libstdc++.so. I've always considered gcc's tendency to build binaries that don't run by default a major annoyance, all the weasel wording in the FAQ notwithstanding. I hope to finally do something

True, but if your host compiler builds something then it's the host compiler install's business to make sure it can run ... (and thus make the libgcc_s it links to available or only provide a static libgcc_s).

Exactly my words. But gcc provides zero help for that. All proposed workarounds (having every user of gcc set LD_LIBRARY_PATH, messing around with ldconfig or equivalent, having every single compiler user provide -rpath/-R on his own) are usability or functionality nightmares one way or another: the first and second don't scale (imagine a large site with hundreds of machines and users), and the third imposes work on the user that the compiler can do best on its own. This may be somewhat acceptable for single-user/single-system installations of knowledgeable users, but otherwise it's just a bad joke. The compiler has all the information on when and how to pass -rpath/-R and should provide an option to do so. And if a target has different multilibs, the user suddenly needs to know not only about $libdir, but also about the various multilib (sub)dirs. Not what I consider user-friendly, and I've seen a large enough share of somewhat experienced users fail with that.

Ah, we've done that in the past for compilers with libs in non-standard install locations ...
cat $RPM_BUILD_ROOT%{libsubdir}/defaults.spec EOF *link: + %%{!m32:%%{!m64:-rpath=%{libsubdir}}} %%{m32:-rpath=%{libsubdir}/32} %%{m64:-rpath=%{libsubdir}/64} EOF together with the following (ISTR it was posted but never approved / applied): Index: gcc/gcc.c === --- gcc/gcc.c.orig 2012-11-28 10:36:38.0 +0100 +++ gcc/gcc.c 2012-12-11 12:30:30.053124280 +0100 @@ -260,6 +260,7 @@ static const char *replace_outfile_spec_ static const char *remove_outfile_spec_function (int, const char **); static const char *version_compare_spec_function (int, const char **); static const char *include_spec_function (int, const char **); +static const char *include_noerr_spec_function (int, const char **); static const char *find_file_spec_function (int, const char **); static const char *find_plugindir_spec_function (int, const char **); static const char *print_asm_header_spec_function (int, const char **); @@ -1293,6 +1294,7 @@ static const struct spec_function static { remove-outfile, remove_outfile_spec_function }, { version-compare, version_compare_spec_function }, { include, include_spec_function }, + { include_noerr,include_noerr_spec_function }, { find-file, find_file_spec_function }, { find-plugindir, find_plugindir_spec_function }, { print-asm-header,print_asm_header_spec_function }, @@ -6382,6 +6384,8 @@ main (int argc, char **argv) if (access (specs_file, R_OK) == 0) read_specs (specs_file, true, false); + do_self_spec (%:include_noerr(defaults.spec)%(default_spec)); + /* Process any configure-time defaults specified for the command line options, via OPTION_DEFAULT_SPECS. 
*/ for (i = 0; i ARRAY_SIZE (option_default_specs); i++) @@ -8271,6 +8275,21 @@ get_random_number (void) return ret ^ getpid(); } +static const char * +include_noerr_spec_function (int argc, const char **argv) +{ + char *file; + + if (argc != 1) +abort (); + + file = find_a_file (startfile_prefixes, argv[0], R_OK, 0); + if (file) +read_specs (file, FALSE, TRUE); + + return NULL; +} + /* %:compare-debug-dump-opt spec function. Save the last argument, expected to be the last -fdump-final-insns option, or generate a temporary. */ For this particular case at least. Note that I'm not against linking against static libgcc_s for lto-plugin. The -static-libstdc++ we use is just because during bootstrap picking up the correct libstdc++ was deemed too hard to implement and thus the easy way out was -static-libstdc++. So how should we go forward with this issue? This bootstrap failure is a regression from all previous releases. As I said, I'd rather not duplicate the -static-libgcc test from the toplevel, but would do so if all else fails. Perhaps Paolo could weigh in as the build maintainer? Yeah, I'd like a build maintainer to look over your first proposed patch (workaround libtools nicyness). Richard.
Re: [Patch][google/main] Fix arm build broken
On 12/03/14 22:35, Hán Shěn (沈涵) wrote: ARM build (on chrome) is broken because of duplicate entries in arm.md and unspecs.md. Fixed by removing duplication and merge those in arm.md into unspecs.md. (We had a similar fix for google/gcc-4_8 here - http://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=198650) Tested by building arm cross compiler successfully. Ok for google/main? Sounds to me like a merge botch. UNSPEC_SIN and UNSPEC_COS were removed from trunk some time back, when the old FPA code was removed. I very much doubt that you need to be re-adding them. R. Patch below - diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 8b269a4..9aec213 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -75,27 +75,6 @@ ] ) -;; UNSPEC Usage: -;; Note: sin and cos are no-longer used. -;; Unspec enumerators for Neon are defined in neon.md. - -(define_c_enum unspec [ - UNSPEC_SIN; `sin' operation (MODE_FLOAT): -; operand 0 is the result, -; operand 1 the parameter. - UNPSEC_COS; `cos' operation (MODE_FLOAT): -; operand 0 is the result, -; operand 1 the parameter. - UNSPEC_PROLOGUE_USE ; As USE insns are not meaningful after reload, -; this unspec is used to prevent the deletion of -; instructions setting registers for EH handling -; and stack frame generation. Operand 0 is the -; register to use. - UNSPEC_WMADDS ; Used by the intrinsic form of the iWMMXt WMADDS instruction. - UNSPEC_WMADDU ; Used by the intrinsic form of the iWMMXt WMADDU instruction. - UNSPEC_GOT_PREL_SYM ; Specify an R_ARM_GOT_PREL relocation of a symbol. -]) - ;; UNSPEC_VOLATILE Usage: diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index 8caa953..89bc528 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -24,6 +24,12 @@ ;; Unspec enumerators for iwmmxt2 are defined in iwmmxt2.md (define_c_enum unspec [ + UNSPEC_SIN; `sin' operation (MODE_FLOAT): +; operand 0 is the result, +; operand 1 the parameter. 
+ UNPSEC_COS; `cos' operation (MODE_FLOAT): +; operand 0 is the result, +; operand 1 the parameter. UNSPEC_PUSH_MULT ; `push multiple' operation: ; operand 0 is the first register, ; subsequent registers are in parallel (use ...) @@ -58,6 +64,7 @@ ; instruction stream. UNSPEC_PIC_OFFSET ; A symbolic 12-bit OFFSET that has been treated ; correctly for PIC usage. + UNSPEC_GOT_PREL_SYM ; Specify an R_ARM_GOT_PREL relocation of a symbol. UNSPEC_GOTSYM_OFF ; The offset of the start of the GOT from a ; a given symbolic address. UNSPEC_THUMB1_CASESI ; A Thumb1 compressed dispatch-table call. @@ -70,6 +77,11 @@ ; that. UNSPEC_UNALIGNED_STORE ; Same for str/strh. UNSPEC_PIC_UNIFIED; Create a common pic addressing form. + UNSPEC_PROLOGUE_USE ; As USE insns are not meaningful after reload, +; this unspec is used to prevent the deletion of +; instructions setting registers for EH handling +; and stack frame generation. Operand 0 is the +; register to use. UNSPEC_LL ; Represent an unpaired load-register-exclusive. UNSPEC_VRINTZ ; Represent a float to integral float rounding ; towards zero. @@ -87,6 +99,8 @@ (define_c_enum unspec [ UNSPEC_WADDC ; Used by the intrinsic form of the iWMMXt WADDC instruction. + UNSPEC_WMADDS ; Used by the intrinsic form of the iWMMXt WMADDS instruction. + UNSPEC_WMADDU ; Used by the intrinsic form of the iWMMXt WMADDU instruction. UNSPEC_WABS ; Used by the intrinsic form of the iWMMXt WABS instruction. UNSPEC_WQMULWMR ; Used by the intrinsic form of the iWMMXt WQMULWMR instruction. UNSPEC_WQMULMR ; Used by the intrinsic form of the iWMMXt WQMULMR instruction. Han
Re: [PATCH] Use the LTO linker plugin by default
Richard Biener rguent...@suse.de writes:

Exactly my words. But gcc provides zero help for that. All proposed workarounds (having every user of gcc set LD_LIBRARY_PATH, messing around with ldconfig or equivalent, having every single compiler user provide -rpath/-R on his own) are usability or functionality nightmares one way or another: the first and second don't scale (imagine a large site with hundreds of machines and users), and the third imposes work on the user that the compiler can do best on its own. This may be somewhat acceptable for single-user/single-system installations of knowledgeable users, but otherwise it's just a bad joke. The compiler has all the information on when and how to pass -rpath/-R and should provide an option to do so. And if a target has different multilibs, the user suddenly needs to know not only about $libdir, but also about the various multilib (sub)dirs. Not what I consider user-friendly, and I've seen a large enough share of somewhat experienced users fail with that.

Ah, we've done that in the past for compilers with libs in non-standard install locations ...
cat $RPM_BUILD_ROOT%{libsubdir}/defaults.spec EOF *link: + %%{!m32:%%{!m64:-rpath=%{libsubdir}}} %%{m32:-rpath=%{libsubdir}/32} %%{m64:-rpath=%{libsubdir}/64} EOF together with the following (ISTR it was posted but never approved / applied): Index: gcc/gcc.c === --- gcc/gcc.c.orig2012-11-28 10:36:38.0 +0100 +++ gcc/gcc.c 2012-12-11 12:30:30.053124280 +0100 @@ -260,6 +260,7 @@ static const char *replace_outfile_spec_ static const char *remove_outfile_spec_function (int, const char **); static const char *version_compare_spec_function (int, const char **); static const char *include_spec_function (int, const char **); +static const char *include_noerr_spec_function (int, const char **); static const char *find_file_spec_function (int, const char **); static const char *find_plugindir_spec_function (int, const char **); static const char *print_asm_header_spec_function (int, const char **); @@ -1293,6 +1294,7 @@ static const struct spec_function static { remove-outfile,remove_outfile_spec_function }, { version-compare, version_compare_spec_function }, { include, include_spec_function }, + { include_noerr,include_noerr_spec_function }, { find-file, find_file_spec_function }, { find-plugindir,find_plugindir_spec_function }, { print-asm-header, print_asm_header_spec_function }, @@ -6382,6 +6384,8 @@ main (int argc, char **argv) if (access (specs_file, R_OK) == 0) read_specs (specs_file, true, false); + do_self_spec (%:include_noerr(defaults.spec)%(default_spec)); + /* Process any configure-time defaults specified for the command line options, via OPTION_DEFAULT_SPECS. 
*/ for (i = 0; i ARRAY_SIZE (option_default_specs); i++) @@ -8271,6 +8275,21 @@ get_random_number (void) return ret ^ getpid(); } +static const char * +include_noerr_spec_function (int argc, const char **argv) +{ + char *file; + + if (argc != 1) +abort (); + + file = find_a_file (startfile_prefixes, argv[0], R_OK, 0); + if (file) +read_specs (file, FALSE, TRUE); + + return NULL; +} + /* %:compare-debug-dump-opt spec function. Save the last argument, expected to be the last -fdump-final-insns option, or generate a temporary. */ Certainly a step in the right direction, though it will add RPATH in more cases than necessary. I've a couple of ideas for that, and would like to use $ORIGIN if available for inter-runtime lib dependencies to retain relocatability. I'll see how far I come during the next release cycle. For this particular case at least. Note that I'm not against linking against static libgcc_s for lto-plugin. The -static-libstdc++ we use is just because during bootstrap picking up the correct libstdc++ was deemed too hard to implement and thus the easy way out was -static-libstdc++. So how should we go forward with this issue? This bootstrap failure is a regression from all previous releases. As I said, I'd rather not duplicate the -static-libgcc test from the toplevel, but would do so if all else fails. Perhaps Paolo could weigh in as the build maintainer? Yeah, I'd like a build maintainer to look over your first proposed patch (workaround libtools nicyness). Just one additional data point: I've checked mainline libtool, and it still doesn't handle (meaning: still drops) -static-libgcc/-static-libstdc++. At least they have some hints in their documentation on what testing etc. it takes to get additional options passed through to the compiler/linker. Rainer -- - Rainer Orth, Center for Biotechnology,
[wwwdocs] Update gcc-4.9/porting_to.html w.r.t null pointer checks
Committed.

Index: htdocs/gcc-4.9/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/porting_to.html,v
retrieving revision 1.4
diff -u -r1.4 porting_to.html
--- htdocs/gcc-4.9/porting_to.html 7 Mar 2014 19:45:14 - 1.4
+++ htdocs/gcc-4.9/porting_to.html 13 Mar 2014 10:17:07 -
@@ -60,6 +60,36 @@
 <code>#pragma omp end declare target</code> directive, this is now a parsing
 error.</p>
 
+<h3>Null pointer checks may be optimized away more aggressively</h3>
+
+<p> GCC might now optimize away the null pointer check in code like:</p>
+
+<pre><code>
+  int copy (int* dest, int* src, size_t nbytes) {
+    memmove (dest, src, nbytes);
+    if (src != NULL)
+      return *src;
+    return 0;
+  }
+</code></pre>
+
+<p>The pointers passed to <code>memmove</code> (and similar functions in
+<code>&lt;string.h&gt;</code>) must be non-null even when
+<code>nbytes==0</code>, so GCC can use that information to remove the check
+after the <code>memmove</code> call. Calling <code>copy(p, NULL, 0)</code>
+can therefore dereference a null pointer and crash.</p>
+
+<p>The example above needs to be fixed to avoid the invalid
+<code>memmove</code> call, for example:</p>
+
+<pre><code>
+    if (nbytes != 0)
+      memmove (dest, src, nbytes);
+</code></pre>
+
+<p>This optimization can also affect implicit null pointer checks such as
+the one done by the C++ runtime for the <code>delete[]</code> operator.</p>
+
 <h2>C language issues</h2>
 
 <h3>Right operand of comma operator without effect</h3>
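Putting the two snippets from the porting note together gives the complete function below. It is only a sketch of the suggested fix; the signature is the one used in the note, and the guard covers just the `nbytes == 0` case (passing a null `src` with a non-zero `nbytes` remains invalid).

```c
#include <stddef.h>
#include <string.h>

/* Copy NBYTES bytes from SRC to DEST and return the first int of SRC,
   or 0 if SRC is null.  memmove is only called when there is something
   to copy, so copy (p, NULL, 0) never passes a null pointer to it and
   the later null check keeps its meaning.  */
int
copy (int *dest, int *src, size_t nbytes)
{
  if (nbytes != 0)
    memmove (dest, src, nbytes);
  if (src != NULL)
    return *src;
  return 0;
}
```

With this version, `copy (p, NULL, 0)` returns 0 instead of relying on a check that the compiler is entitled to remove.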
[PATCH, libiberty]: Avoid 'right-hand operand of comma expression has no effect' when compiling regex.c
Hello!

When compiling regex.c from libiberty, several warnings are emitted:

../../gcc-svn/trunk/libiberty/regex.c: In function 'byte_regex_compile':
../../gcc-svn/trunk/libiberty/regex.c:154:47: warning: right-hand operand of comma expression has no effect [-Wunused-value]
 # define bzero(s, n) (memset (s, '\0', n), (s))
 ^
../../gcc-svn/trunk/libiberty/regex.c:3126:13: note: in expansion of macro 'bzero'
 bzero (b, (1 << BYTEWIDTH) / BYTEWIDTH);
 ^

The attached patch changes the return value of the bzero macro to void, as defined in 4.3BSD:

void bzero(void *s, size_t n);

As an additional benefit, the changed macro now triggers a compile-time error when its return value is used (which is *not* the case in regex.c):

--cut here--
int *arr;

# define bzero(s, n) (memset (s, '\0', n), (void) 0)

void test (void)
{
  void *t = bzero (arr, 1);
  (void) t;
}
--cut here--

gcc -O2 -Wall

bz.c: In function 'test':
bz.c:7:27: error: void value not ignored as it ought to be
 # define bzero(s, n) (memset (s, '\0', n), (void) 0)
 ^
bz.c:11:13: note: in expansion of macro 'bzero'
 void *t = bzero (arr, 1);
 ^

2014-03-13  Uros Bizjak  ubiz...@gmail.com

    * regex.c (bzero) [!_LIBC]: Change return value to void.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu.

OK for mainline (and release branches, perhaps)?

Uros.

Index: regex.c
===
--- regex.c (revision 208529)
+++ regex.c (working copy)
@@ -151,7 +151,7 @@ char *realloc ();
 #include <string.h>
 #ifndef bzero
 # ifndef _LIBC
-# define bzero(s, n) (memset (s, '\0', n), (s))
+# define bzero(s, n) (memset (s, '\0', n), (void) 0)
 # else
 # define bzero(s, n) __bzero (s, n)
 # endif
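As a small self-contained illustration of the macro shape proposed above, the sketch below uses the patched definition in statement position, which is the only way regex.c uses it. The names `my_bzero` and `clear_and_count` are made up for this example (the macro is renamed only to avoid clashing with a system `bzero`).

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the patched macro: the comma expression ends in
   (void) 0, so the whole macro has type void, like the 4.3BSD
   function.  */
#define my_bzero(s, n) (memset (s, '\0', n), (void) 0)

/* Clear the first N bytes of BUF and return how many of them are zero
   afterwards.  */
static size_t
clear_and_count (unsigned char *buf, size_t n)
{
  size_t zeros = 0;

  my_bzero (buf, n);   /* Fine: the (void) result is discarded.  */
  for (size_t i = 0; i < n; i++)
    if (buf[i] == 0)
      zeros++;
  return zeros;
}
```

Writing `void *t = my_bzero (buf, n);` instead would be rejected at compile time, which is the extra safety net mentioned in the message.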
Re: [PATCH] BZ60501: Add addptr optab
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4720,6 +4720,17 @@
 Add operand 2 and operand 1, storing the result in operand 0.  All operands
 must have mode @var{m}.  This can be used even on two-address machines, by
 means of constraints requiring operands 1 and 0 to be the same location.
 
+@cindex @code{addptr@var{m}3} instruction pattern
+@item @samp{addptr@var{m}3}
+Like @code{addptr@var{m}3} but does never clobber the condition code.
+It only needs to be defined if @code{add@var{m}3} either sets the
+condition code or address calculations cannot be performed with the
+normal add instructions due to other reasons.  If adds used for
+address calculations and normal adds are not compatible it is required
+to expand a distinct pattern (e.g. using an unspec).  The pattern is
+used by LRA to emit address calculations.  @code{add@var{m}3} is used
+if @code{addptr@var{m}3} is not defined.

I'm a bit skeptical of the "address calculations cannot be performed with the normal add instructions due to other reasons" part. Surely they can be performed on all architectures supported by GCC as of this writing, otherwise how would the compiler even work? And if it's really "Like @code{add@var{m}3}", why restrict it to addresses, i.e. why call it @code{addptr@var{m}3}? Does that come from an implementation constraint on s390 that supports it only for a subset of the cases supported by @code{add@var{m}3}?

diff --git a/gcc/lra.c b/gcc/lra.c
index 77074e2..e5e81474 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -254,6 +254,19 @@ emit_add3_insn (rtx x, rtx y, rtx z)
   rtx insn, last;
 
   last = get_last_insn ();
+
+  if (have_addptr3_insn (x, y, z))
+    {
+      insn = gen_addptr3_insn (x, y, z);
+
+      /* If the target provides an addptr pattern it hopefully does
+         for a reason.  So falling back to the normal add would be
+         a bug.  */
+      lra_assert (insn != NULL_RTX);
+      emit_insn (insn);
+      return insn;
+    }
+
   insn = emit_insn (gen_rtx_SET (VOIDmode, x,
                                  gen_rtx_PLUS (GET_MODE (y), y, z)));
   if (recog_memoized (insn) < 0)

Same ambiguity here, emit_add3_insn is not documented as being restricted to addresses:

/* Emit insn x = y + z.  Return NULL if we failed to do it.
   Otherwise, return the insn.  We don't use gen_add3_insn as it might
   clobber CC.  */
static rtx
emit_add3_insn (rtx x, rtx y, rtx z)

-- 
Eric Botcazou
[PATCH][match-and-simplify] Properly simplify before failing with !seq
This makes sure we simplify an expression before giving up because the caller only expects a singleton result. Otherwise patterns like (match_and_simplify (plus @0 integer_zerop) @0) (match_and_simplify (plus (plus @0 INTEGER_CST_P@1) INTEGER_CST_P@2) (plus @0 (plus @1 @2))) that expect to constant-fold (plus @1 @2) and the resulting (plus @0 0) for (a + 1) + -1 won't match when called with !seq (as for example SCCVN does). Bootstrapped and tested on x86_64-unknown-linux-gnu and applied. Richard. 2014-03-13 Richard Biener rguent...@suse.de * genmatch.c (expr::gen_gimple_transform): Make sure to try simplifying a transform pattern even when seq is NULL as seq may not be needed. Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 208478) +++ gcc/genmatch.c (working copy) @@ -332,20 +332,32 @@ void expr::gen_gimple_transform (FILE *f, const char *label) { fprintf (f, ({\n); - fprintf (f, if (!seq) ); - gen_gimple_match_fail (f, label); - fprintf (f, tree ops[%d];\n, ops.length ()); + fprintf (f, tree ops[%d], res;\n, ops.length ()); for (unsigned i = 0; i ops.length (); ++i) { - fprintf (f,ops[%u] = , i); + fprintf (f, ops[%u] = , i); ops[i]-gen_gimple_transform (f, label); fprintf (f, ;\n); } - fprintf (f, gimple_build (seq, UNKNOWN_LOCATION, %s, TREE_TYPE (ops[0]), + /* ??? Have another helper that is like gimple_build but may + fail if seq == NULL. */ + fprintf (f, if (!seq)\n + {\n +res = gimple_match_and_simplify (%s, TREE_TYPE (ops[0]), operation-op-id); for (unsigned i = 0; i ops.length (); ++i) fprintf (f, , ops[%u], i); + fprintf (f, , seq, valueize);\n); + fprintf (f, if (!res) ); + gen_gimple_match_fail (f, label); + fprintf (f, }\n); + fprintf (f, else\n); + fprintf (f, res = gimple_build (seq, UNKNOWN_LOCATION, %s, + TREE_TYPE (ops[0]), operation-op-id); + for (unsigned i = 0; i ops.length (); ++i) +fprintf (f, , ops[%u], i); fprintf (f, , valueize);\n); + fprintf (f, res;\n); fprintf (f, })); }
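To make the (a + 1) + -1 case above concrete, here is a toy model of the two patterns. It is a sketch under stated assumptions rather than anything generated by genmatch: `struct expr` and `simplify_plus` are hypothetical, and "no new node may be created" stands in for being called with a NULL seq. The point it illustrates is that the rewritten (plus @0 (plus @1 @2)) has to be simplified again before giving up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node: a constant, a variable, or a PLUS of two
   operands.  */
enum kind { CST, VAR, PLUS };

struct expr
{
  enum kind kind;
  long value;                   /* Used by CST only.  */
  char name;                    /* Used by VAR only.  */
  const struct expr *op0, *op1; /* Used by PLUS only.  */
};

/* Try to simplify OP0 + OP1 to an existing node or to a constant,
   without creating a new PLUS node.  On success store the result in
   *RES and return true.  SCRATCH provides storage for a folded
   constant.  */
static bool
simplify_plus (const struct expr *op0, const struct expr *op1,
               struct expr *scratch, const struct expr **res)
{
  /* Constant folding: c1 + c2.  */
  if (op0->kind == CST && op1->kind == CST)
    {
      scratch->kind = CST;
      scratch->value = op0->value + op1->value;
      *res = scratch;
      return true;
    }

  /* (plus @0 integer_zerop) -> @0.  */
  if (op1->kind == CST && op1->value == 0)
    {
      *res = op0;
      return true;
    }

  /* (plus (plus @0 CST@1) CST@2) -> (plus @0 (plus @1 @2)).  Both the
     inner and the rewritten outer plus are simplified again; only if
     that fails would a new node be required.  */
  if (op0->kind == PLUS && op0->op1->kind == CST && op1->kind == CST)
    {
      struct expr folded;
      const struct expr *inner;

      if (simplify_plus (op0->op1, op1, &folded, &inner)
          && simplify_plus (op0->op0, inner, scratch, res))
        return true;
    }

  return false;
}
```

In this model (a + 1) + -1 simplifies to the existing node for a, while (a + 1) + 2 fails because a + 3 would need a fresh node.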
Re: [PATCH] BZ60501: Add addptr optab
On Thu, Mar 13, 2014 at 12:16 PM, Eric Botcazou ebotca...@adacore.com wrote:

--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4720,6 +4720,17 @@
 Add operand 2 and operand 1, storing the result in operand 0.  All operands
 must have mode @var{m}.  This can be used even on two-address machines, by
 means of constraints requiring operands 1 and 0 to be the same location.
 
+@cindex @code{addptr@var{m}3} instruction pattern
+@item @samp{addptr@var{m}3}
+Like @code{addptr@var{m}3} but does never clobber the condition code.
+It only needs to be defined if @code{add@var{m}3} either sets the
+condition code or address calculations cannot be performed with the
+normal add instructions due to other reasons.  If adds used for
+address calculations and normal adds are not compatible it is required
+to expand a distinct pattern (e.g. using an unspec).  The pattern is
+used by LRA to emit address calculations.  @code{add@var{m}3} is used
+if @code{addptr@var{m}3} is not defined.

I'm a bit skeptical of the "address calculations cannot be performed with the normal add instructions due to other reasons" part. Surely they can be performed on all architectures supported by GCC as of this writing, otherwise how would the compiler even work? And if it's really "Like @code{add@var{m}3}", why restrict it to addresses, i.e. why call it @code{addptr@var{m}3}? Does that come from an implementation constraint on s390 that supports it only for a subset of the cases supported by @code{add@var{m}3}?

Yeah, isn't it that you want a named pattern like add_nocc for an add that doesn't clobber flags?

Richard.

diff --git a/gcc/lra.c b/gcc/lra.c
index 77074e2..e5e81474 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -254,6 +254,19 @@ emit_add3_insn (rtx x, rtx y, rtx z)
   rtx insn, last;
 
   last = get_last_insn ();
+
+  if (have_addptr3_insn (x, y, z))
+    {
+      insn = gen_addptr3_insn (x, y, z);
+
+      /* If the target provides an addptr pattern it hopefully does
+         for a reason.  So falling back to the normal add would be
+         a bug.  */
+      lra_assert (insn != NULL_RTX);
+      emit_insn (insn);
+      return insn;
+    }
+
   insn = emit_insn (gen_rtx_SET (VOIDmode, x,
                                  gen_rtx_PLUS (GET_MODE (y), y, z)));
   if (recog_memoized (insn) < 0)

Same ambiguity here, emit_add3_insn is not documented as being restricted to addresses:

/* Emit insn x = y + z.  Return NULL if we failed to do it.
   Otherwise, return the insn.  We don't use gen_add3_insn as it might
   clobber CC.  */
static rtx
emit_add3_insn (rtx x, rtx y, rtx z)

-- 
Eric Botcazou
Re: [PATCH 1/4] [GOMP4] [Fortran] OpenACC 1.0+ support in fortran front-end
Hi Ilmir!

On Thu, 13 Mar 2014 13:34:54 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:

Committed as r208533.

Yay! \o/

Three minor things:

Please move the entries from gcc/ChangeLog.gomp to the more specific gcc/[...]/ChangeLog.gomp files. Generally, we need to locate and then put ChangeLog snippets into the most specific ChangeLog file (so, for example, gcc/testsuite/ChangeLog for gcc/testsuite/gfortran.dg/goacc/assumed.f95), and as this is on the gomp-4_0-branch, then use (or create, if not yet present) the ChangeLog.gomp file instead of plain ChangeLog.

The following change seems the right thing to do -- but why doesn't the current code trigger a GCC ICE due to a failing subcode check? (At least I thought you had test cases exercising the respective OpenACC vector clauses?) Can you please check that? If it's just missing a test case, feel free to commit that together with my patch.

commit ee65334ec81b092111e9b2b34a0ee3ceb933b643
Author: Thomas Schwinge tho...@codesourcery.com
Date: Thu Mar 13 12:26:47 2014 +0100

    Fix OMP_CLAUSE_VECTOR_EXPR subcode check.

    gcc/
    * tree.h (OMP_CLAUSE_VECTOR_EXPR): Check for OMP_CLAUSE_VECTOR
    instead of OMP_CLAUSE_VECTOR_LENGTH.

diff --git gcc/tree.h gcc/tree.h
index bd70680..5ef2a0a 100644
--- gcc/tree.h
+++ gcc/tree.h
@@ -1323,7 +1323,7 @@ extern void protected_set_expr_location (tree, location_t);
     OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_WAIT), 0)
 #define OMP_CLAUSE_VECTOR_EXPR(NODE) \
   OMP_CLAUSE_OPERAND ( \
-    OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_VECTOR_LENGTH), 0)
+    OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_VECTOR), 0)
 #define OMP_CLAUSE_WORKER_EXPR(NODE) \
   OMP_CLAUSE_OPERAND ( \
     OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_WORKER), 0)

The following cleanup should be fine to check in, or is there a reason for using OMP_WAIT_EXPR instead of OMP_CLAUSE_WAIT_EXPR?

commit 7d69bdf8471e512791d4b7e0121efde7725a0cb9
Author: Thomas Schwinge tho...@codesourcery.com
Date: Thu Mar 13 12:25:14 2014 +0100

    Rename OMP_WAIT_EXPR to OMP_CLAUSE_WAIT_EXPR.

    gcc/
    * tree.h (OMP_WAIT_EXPR): Rename to OMP_CLAUSE_WAIT_EXPR.  Change
    all users.

diff --git gcc/fortran/trans-openmp.c gcc/fortran/trans-openmp.c
index a1abd66..29364f4 100644
--- gcc/fortran/trans-openmp.c
+++ gcc/fortran/trans-openmp.c
@@ -1191,7 +1191,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
       tree wait_var = gfc_convert_expr_to_tree (block,
                                                 clauses->non_clause_wait_expr);
       c = build_omp_clause (where.lb->location, OMP_CLAUSE_WAIT);
-      OMP_WAIT_EXPR (c)= wait_var;
+      OMP_CLAUSE_WAIT_EXPR (c)= wait_var;
       omp_clauses = gfc_trans_add_clause (c, omp_clauses);
     }
 
diff --git gcc/tree.h gcc/tree.h
index fbac81b..bd70680 100644
--- gcc/tree.h
+++ gcc/tree.h
@@ -1318,7 +1318,7 @@ extern void protected_set_expr_location (tree, location_t);
 #define OMP_CLAUSE_ASYNC_EXPR(NODE) \
   OMP_CLAUSE_OPERAND ( \
     OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_ASYNC), 0)
-#define OMP_WAIT_EXPR(NODE) \
+#define OMP_CLAUSE_WAIT_EXPR(NODE) \
   OMP_CLAUSE_OPERAND ( \
     OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_WAIT), 0)
 #define OMP_CLAUSE_VECTOR_EXPR(NODE) \

Grüße,
 Thomas
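For context on why the code passed to the subcode check matters, here is a toy model of a checked accessor. All names below (`struct clause`, `clause_subcode_check`, the `CLAUSE_*` codes and macros) are hypothetical stand-ins, not GCC's tree.h definitions; in particular, this sketch returns NULL on a mismatch so that the effect can be observed, whereas GCC's own check, as far as I understand it, reports an internal error in checking-enabled builds.

```c
#include <stddef.h>

/* Toy clause codes, named after the ones discussed above.  */
enum clause_code { CLAUSE_WAIT, CLAUSE_VECTOR, CLAUSE_VECTOR_LENGTH };

struct clause
{
  enum clause_code code;
  int operand;
};

/* Return NODE if it has the expected CODE, otherwise NULL.  */
static struct clause *
clause_subcode_check (struct clause *node, enum clause_code code)
{
  return node->code == code ? node : NULL;
}

/* Accessor guarded by the matching code, as in the corrected macro.  */
#define CLAUSE_VECTOR_EXPR(NODE) \
  (clause_subcode_check ((NODE), CLAUSE_VECTOR))

/* The same accessor guarded by the wrong code, as before the fix.  */
#define CLAUSE_VECTOR_EXPR_WRONG(NODE) \
  (clause_subcode_check ((NODE), CLAUSE_VECTOR_LENGTH))
```

Applied to a vector clause, the first accessor yields the node and the second one fails, which is the failure one would expect the original macro to hit as soon as a vector clause actually reaches it.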
Re: [PATCH] Try to avoid sorting on SSA_NAME_VERSION during reassoc (PR middle-end/60418)
On Thu, 2014-03-13 at 10:30 +0100, Jakub Jelinek wrote: On Thu, Mar 13, 2014 at 10:25:57AM +0100, Richard Biener wrote: Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Does this also fix the PPC regression? That is a question for Peter, if we can ask him to test it. From what I understood, the bug no longer reproduces on the trunk on ppc*, and Peter tested an earlier version of this patch on top of an old revision where it still reproduced. Actually, I think it was Pat who did the testing earlier. Pat, can you confirm Jakub's patch fixes the PPC regression? Peter 2014-03-12 Jakub Jelinek ja...@redhat.com PR tree-optimization/59025 PR middle-end/60418 * tree-ssa-reassoc.c (sort_by_operand_rank): For SSA_NAMEs with the same rank, sort by bb_rank and gimple_uid of SSA_NAME_DEF_STMT first.
Re: [PATCH 1/4] [GOMP4] [Fortran] OpenACC 1.0+ support in fortran front-end
Hi Thomas! On 13.03.2014 15:38, Thomas Schwinge wrote: Hi Ilmir! On Thu, 13 Mar 2014 13:34:54 +0400, Ilmir Usmanov i.usma...@samsung.com wrote: Committed as r208533. Yay! \o/ Thanks! Yay indeed. And thanks a lot for your comments. The following change seems the right thing to do -- but why doesn't the current code trigger a GCC ICE due to a failing subcode check? (At least I thought you had test cases exercising the respective OpenACC vector clauses?) Can you please check that? If it's just missing a test case, feel free to commit that together with my patch. Unfortunately, I can't check this with a test right now, since the OpenACC loop directive is not implemented yet. The following cleanup should be fine to check in, or is there a reason for using OMP_WAIT_EXPR instead of OMP_CLAUSE_WAIT_EXPR? No, there is no reason for using OMP_WAIT_EXPR. Thanks! -- Ilmir.
[patch,avr] Fix PR59396: Ignore leading '*' in warning generation for ISR names
This is again a request for approval to fix PR59396.

The problem is that the assembler name might or might not be prefixed by '*' depending on when TARGET_SET_CURRENT_FUNCTION is called.

The change just fixes a wrong warning: the current implementation of TARGET_SET_CURRENT_FUNCTION /always/ skips the first char when the assembler name is set.

A leading '*' is used for handling of leading underscores and I don't intend to interfere with or change that handling in any way. Thus I think changes outside the avr back end are not indicated at all.

The original approval has been denied because the overall '*' handling was not changed. I still think this is not appropriate in the present case. I am not aware of any cases where leading underscores do not work as expected. Besides that, avr does not use leading underscores at all. Moreover, some built-ins (ab)use the assembler name to set the function's name directly. Again, I don't intend to change that target-independent code in any way.

Thus, ok to apply?

Johann

	PR target/59396
	* config/avr/avr.c (avr_set_current_function): Skip the first char of the (assembler) name provided it's a '*'.

Index: config/avr/avr.c
===
--- config/avr/avr.c (revision 208532)
+++ config/avr/avr.c (working copy)
@@ -600,10 +600,15 @@ avr_set_current_function (tree decl)
   const char *name;
 
   name = DECL_ASSEMBLER_NAME_SET_P (decl)
-    /* Remove the leading '*' added in set_user_assembler_name.  */
-    ? 1 + IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl))
+    ? IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl))
     : IDENTIFIER_POINTER (DECL_NAME (decl));
 
+  /* Skip a leading '*' that might still prefix the assembler name,
+     e.g. in non-LTO runs.  */
+
+  if (*name == '*')
+    name++;
+
   /* Silently ignore 'signal' if 'interrupt' is present.  AVR-LibC startet
      using this when it switched from SIGNAL and INTERRUPT to ISR.  */
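To make the difference between "always skip the first char" and "skip it only if it is a '*'" concrete, here is a small standalone C sketch. The helper name `skip_leading_star` and the sample strings in the usage note are made up for illustration; the patch itself does this inline on the `name` pointer.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC: return NAME without a single
   leading '*', if there is one.  Only the pointer is advanced, nothing
   is copied, and a name without the prefix is returned unchanged.  */
static const char *
skip_leading_star (const char *name)
{
  if (name != NULL && *name == '*')
    name++;
  return name;
}
```

With this shape, a name such as "*__vector_1" and a name such as "__vector_1" both come out as "__vector_1", whereas an unconditional `1 + name` would turn the second one into "_vector_1".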
[C++ Patch] PR 60383
Hi,

this 4.9 regression is again an ICE during error recovery: check_specialization_namespace errors out, maybe_process_partial_specialization doesn't check its return value, and the ICE happens much later, in retrieve_specialization, in the gcc_assert:

  /* There should be as many levels of arguments as there are
     levels of parameters.  */
  gcc_assert (TMPL_ARGS_DEPTH (args)
              == (TREE_CODE (tmpl) == TEMPLATE_DECL
                  ? TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
                  : template_class_depth (DECL_CONTEXT (tmpl))));

Simply checking the return value of check_specialization_namespace and early returning error_mark_node appears to work well, with the minor complication that check_specialization_namespace may return false also in case of permerror, thus the !at_namespace_scope_p. Other than that, the tweak to crash95.C doesn't seem bad to me (for example, it aligns our diagnostic to that of current clang).

Tested x86_64-linux.

Thanks,
Paolo.

/cp
2014-03-13  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/60383
	* pt.c (maybe_process_partial_specialization): Check return value of check_specialization_namespace.

/testsuite
2014-03-13  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/60383
	* g++.dg/template/crash118.C: New.
	* g++.dg/template/crash95.C: Adjust.

Index: cp/pt.c
===
--- cp/pt.c (revision 208538)
+++ cp/pt.c (working copy)
@@ -850,7 +850,9 @@ maybe_process_partial_specialization (tree type)
       if (CLASSTYPE_IMPLICIT_INSTANTIATION (type)
           && !COMPLETE_TYPE_P (type))
         {
-          check_specialization_namespace (CLASSTYPE_TI_TEMPLATE (type));
+          if (!check_specialization_namespace (CLASSTYPE_TI_TEMPLATE (type))
+              && !at_namespace_scope_p ())
+            return error_mark_node;
           SET_CLASSTYPE_TEMPLATE_SPECIALIZATION (type);
           DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (type)) = input_location;
           if (processing_template_decl)
Index: testsuite/g++.dg/template/crash118.C
===
--- testsuite/g++.dg/template/crash118.C (revision 0)
+++ testsuite/g++.dg/template/crash118.C (working copy)
@@ -0,0 +1,11 @@
+// PR c++/60383
+
+template<int> struct A
+{
+  template<typename> struct B
+  {
+    template<typename T> struct B<T*> {};  // { dg-error "specialization" }
+  };
+};
+
+A<0>::B<char*> b;
Index: testsuite/g++.dg/template/crash95.C
===
--- testsuite/g++.dg/template/crash95.C (revision 208538)
+++ testsuite/g++.dg/template/crash95.C (working copy)
@@ -8,4 +8,4 @@ template <typename> struct S
   };
 };
 
-S<int> s(0); // { dg-error "incomplete type" }
+S<int> s(0); // { dg-error "no matching" }
[PATCH] LeakSanitizer testsuite
On 03/13/2014 12:33 PM, Maxim Ostapenko wrote:
Hi,

This patch adds initial set of tests and dg infrastructure for LeakSanitizer runtime.

Tested on x86_64.

Ok to commit?

-Maxim

Fixed subject.

2014-03-13  Max Ostapenko  m.ostape...@partner.samsung.com

	* c-c++-common/lsan/fork.c: New test.
	* c-c++-common/lsan/ignore_object.c: New test.
	* c-c++-common/lsan/ignore_object_errors.c: New test.
	* c-c++-common/lsan/large_allocation_leak.c: New test.
	* c-c++-common/lsan/leak_check_at_exit-1.c: New test.
	* c-c++-common/lsan/leak_check_at_exit-2.c: New test.
	* c-c++-common/lsan/link_turned_off.c: New test.
	* c-c++-common/lsan/pointer_to_self.c: New test.
	* c-c++-common/lsan/suppressions_default.c: New test.
	* c-c++-common/lsan/swapcontext-1.c: New test.
	* c-c++-common/lsan/swapcontext-2.c: New test.
	* c-c++-common/lsan/use_after_return.c: New test.
	* c-c++-common/lsan/use_globals_initialized.c: New test.
	* c-c++-common/lsan/use_globals_uninitialized.c: New test.
	* c-c++-common/lsan/use_stacks.c: New test.
	* c-c++-common/lsan/use_tls_static.c: New test.
	* c-c++-common/lsan/use_unaligned.c: New test.
	* g++.dg/lsan/lsan.exp: New file.
	* gcc.dg/lsan/lsan.exp: New file.
	* lib/lsan-dg.exp: New file.

diff --git a/gcc/testsuite/c-c++-common/lsan/fork.c b/gcc/testsuite/c-c++-common/lsan/fork.c
new file mode 100644
index 000..4dc9d4b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/fork.c
@@ -0,0 +1,23 @@
+// Test that thread local data is handled correctly after forking without exec().
+/* { dg-do run } */
+
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+__thread void *thread_local_var;
+
+int main() {
+  int status = 0;
+  thread_local_var = malloc(1337);
+  pid_t pid = fork();
+  assert(pid >= 0);
+  if (pid > 0) {
+    waitpid(pid, &status, 0);
+    assert(WIFEXITED(status));
+    return WEXITSTATUS(status);
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/lsan/ignore_object.c b/gcc/testsuite/c-c++-common/lsan/ignore_object.c
new file mode 100644
index 000..d73f08b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/ignore_object.c
@@ -0,0 +1,31 @@
+// Test for __lsan_ignore_object().
+
+/* { dg-do run } */
+/* { dg-set-target-env-var LSAN_OPTIONS "report_objects=1:use_registers=0:use_stacks=0:use_globals=0:use_tls=0:verbosity=3" } */
+/* { dg-set-target-env-var ASAN_OPTIONS "verbosity=3" } */
+/* { dg-shouldfail "lsan" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __cplusplus
+extern "C"
+#endif
+void __lsan_ignore_object(void **p);
+
+int main() {
+  // Explicitly ignored object.
+  void **p = (void **) malloc(sizeof(void **));
+  // Transitively ignored object.
+  *p = malloc(666);
+  // Non-ignored object.
+  volatile void *q = malloc(1337);
+  fprintf(stderr, "Test alloc: %p.\n", p);
+  fprintf(stderr, "Test alloc_2: %p.\n", q);
+  __lsan_ignore_object(p);
+  return 0;
+}
+
+/* { dg-output "Test alloc: .*" } */
+/* { dg-output "ignoring heap object at .*" } */
+/* { dg-output "SUMMARY: (Leak|Address)Sanitizer: 1337 byte\\(s\\) leaked in 1 allocation\\(s\\).*" } */
diff --git a/gcc/testsuite/c-c++-common/lsan/ignore_object_errors.c b/gcc/testsuite/c-c++-common/lsan/ignore_object_errors.c
new file mode 100644
index 000..47a1cd1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/ignore_object_errors.c
@@ -0,0 +1,25 @@
+// Test for incorrect use of __lsan_ignore_object().
+/* { dg-do run } */
+/* { dg-set-target-env-var LSAN_OPTIONS "verbosity=2" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __cplusplus
+extern "C"
+#endif
+void __lsan_ignore_object(const void *p);
+
+int main() {
+  void *p = malloc(1337);
+  fprintf(stderr, "Test alloc: %p.\n", p);
+  __lsan_ignore_object(p);
+  __lsan_ignore_object(p);
+  free(p);
+  __lsan_ignore_object(p);
+  return 0;
+}
+
+/* { dg-output "Test alloc: .*" } */
+/* { dg-output "heap object at .* is already being ignored.*" } */
+/* { dg-output "no heap object found at .*" } */
diff --git a/gcc/testsuite/c-c++-common/lsan/large_allocation_leak.c b/gcc/testsuite/c-c++-common/lsan/large_allocation_leak.c
new file mode 100644
index 000..36511d3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/lsan/large_allocation_leak.c
@@ -0,0 +1,19 @@
+// Test that LargeMmapAllocator's chunks aren't reachable via some internal data structure.
+/* { dg-do run } */
+/* { dg-set-target-env-var LSAN_OPTIONS "report_objects=1:use_stacks=0:use_registers=0" } */
+/* { dg-shouldfail "lsan" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+int main() {
+  // maxsize in primary allocator is always less than this (1 << 25).
+  void *large_alloc = malloc(33554432);
+  fprintf(stderr, "Test alloc: %p.\n", large_alloc);
+  return 0;
+}
+
+/* { dg-output "Test alloc: .*" } */
+/* { dg-output "Directly leaked 33554432 byte object at .*" } */
+/* { dg-output "LeakSanitizer: detected memory leaks.*" } */
+/* { dg-output "SUMMARY: (Leak|Address)Sanitizer" } */
diff --git a/gcc/testsuite/c-c++-common/lsan/leak_check_at_exit-1.c
Re: [PATCH 1/4] [GOMP4] [Fortran] OpenACC 1.0+ support in fortran front-end
On 13.03.2014 16:19, Ilmir Usmanov wrote: The following change seems the right thing to do -- but why doesn't the current code trigger a GCC ICE due to a failing subcode check? (At least I thought you had test cases exercising the respective OpenACC vector clauses?) Can you please check that? If it's just missing a test case, feel free to commit that together with my patch. Unfortunately, I can't check this with a test right now, since the OpenACC loop directive is not implemented yet. The following cleanup should be fine to check in, or is there a reason for using OMP_WAIT_EXPR instead of OMP_CLAUSE_WAIT_EXPR? No, there is no reason for using OMP_WAIT_EXPR. Thanks! Committed as r208541. -- Ilmir.
[PATCH][match-and-simplify] Fix call handling some more
Committed.

Richard.

2014-03-13  Richard Biener  rguent...@suse.de

	* gimple-match-head.c (gimple_resimplify2, gimple_resimplify3): Implement missing call handling.
	(gimple_match_and_simplify): Fix lhs gathering.

Index: gcc/gimple-match-head.c
===
*** gcc/gimple-match-head.c (revision 208482)
--- gcc/gimple-match-head.c (working copy)
*************** gimple_resimplify2 (gimple_seq *seq,
*** 121,133 ****
                      code_helper *res_code, tree type, tree *res_ops,
                      tree (*valueize)(tree))
  {
-   /* FIXME.  */
-   if (!res_code->is_tree_code ())
-     gcc_unreachable ();
    if (CONSTANT_CLASS_P (res_ops[0]) && CONSTANT_CLASS_P (res_ops[1]))
      {
!       tree tem = fold_binary_to_constant (*res_code, type,
!                                           res_ops[0], res_ops[1]);
        if (tem != NULL_TREE)
          {
            res_ops[0] = tem;
--- 121,137 ----
                      code_helper *res_code, tree type, tree *res_ops,
                      tree (*valueize)(tree))
  {
    if (CONSTANT_CLASS_P (res_ops[0]) && CONSTANT_CLASS_P (res_ops[1]))
      {
!       tree tem;
!       if (res_code->is_tree_code ())
!         tem = fold_binary_to_constant (*res_code, type,
!                                        res_ops[0], res_ops[1]);
!       else
!         {
!           tree decl = builtin_decl_implicit (*res_code);
!           tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 2, false);
!         }
        if (tem != NULL_TREE)
          {
            res_ops[0] = tem;
*************** gimple_resimplify2 (gimple_seq *seq,
*** 137,143 ****
      }
  
    /* Canonicalize operand order.  */
!   if (commutative_tree_code (*res_code)
        && tree_swap_operands_p (res_ops[0], res_ops[1], false))
      {
        tree tem = res_ops[0];
--- 141,148 ----
      }
  
    /* Canonicalize operand order.  */
!   if (res_code->is_tree_code ()
!       && commutative_tree_code (*res_code)
        && tree_swap_operands_p (res_ops[0], res_ops[1], false))
      {
        tree tem = res_ops[0];
*************** gimple_resimplify3 (gimple_seq *seq,
*** 168,181 ****
                      code_helper *res_code, tree type, tree *res_ops,
                      tree (*valueize)(tree))
  {
-   /* FIXME.  */
-   if (!res_code->is_tree_code ())
-     gcc_unreachable ();
    if (CONSTANT_CLASS_P (res_ops[0]) && CONSTANT_CLASS_P (res_ops[1])
        && CONSTANT_CLASS_P (res_ops[2]))
      {
!       tree tem = fold_ternary/*_to_constant*/ (*res_code, type, res_ops[0],
!                                                res_ops[1], res_ops[2]);
        if (tem != NULL_TREE
            && CONSTANT_CLASS_P (tem))
          {
--- 173,190 ----
                      code_helper *res_code, tree type, tree *res_ops,
                      tree (*valueize)(tree))
  {
    if (CONSTANT_CLASS_P (res_ops[0]) && CONSTANT_CLASS_P (res_ops[1])
        && CONSTANT_CLASS_P (res_ops[2]))
      {
!       tree tem;
!       if (res_code->is_tree_code ())
!         tem = fold_ternary/*_to_constant*/ (*res_code, type, res_ops[0],
!                                             res_ops[1], res_ops[2]);
!       else
!         {
!           tree decl = builtin_decl_implicit (*res_code);
!           tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 3, false);
!         }
        if (tem != NULL_TREE
            && CONSTANT_CLASS_P (tem))
          {
*************** gimple_resimplify3 (gimple_seq *seq,
*** 186,192 ****
      }
  
    /* Canonicalize operand order.  */
!   if (commutative_ternary_tree_code (*res_code)
        && tree_swap_operands_p (res_ops[0], res_ops[1], false))
      {
        tree tem = res_ops[0];
--- 195,202 ----
      }
  
    /* Canonicalize operand order.  */
!   if (res_code->is_tree_code ()
!       && commutative_ternary_tree_code (*res_code)
        && tree_swap_operands_p (res_ops[0], res_ops[1], false))
      {
        tree tem = res_ops[0];
*************** gimple_match_and_simplify (gimple stmt,
*** 483,490 ****
          }
      }
  
-   /* ??? GIMPLE_CALL handling to be implemented.  */
- 
    return false;
  }
  
--- 493,498 ----
*************** gimple_match_and_simplify (gimple_stmt_i
*** 532,538 ****
        else if (gimple_has_lhs (stmt))
          {
            gimple_seq tail = NULL;
!           tree lhs = gimple_call_lhs (stmt);
            maybe_push_res_to_seq (rcode, TREE_TYPE (lhs),
                                   ops, &tail, lhs);
            gcc_assert (gimple_seq_singleton_p (tail));
--- 540,546 ----
        else if (gimple_has_lhs (stmt))
          {
            gimple_seq tail = NULL;
!           tree lhs = gimple_get_lhs (stmt);
            maybe_push_res_to_seq (rcode, TREE_TYPE (lhs),
                                   ops, &tail, lhs);
            gcc_assert (gimple_seq_singleton_p (tail));
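The shape of the change to gimple_resimplify2 and gimple_resimplify3 is a dispatch on what kind of operation the result code holds, plus a guard so that the commutativity test is only asked of tree codes. The following C sketch mirrors that shape with toy types. Every name in it (`struct op_code`, `fold_binary`, `may_swap_operands`, the enums) is hypothetical and stands in for the GCC machinery; it is an illustration under those assumptions, not the actual implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: an operation is either an arithmetic code or
   a call to a known function.  None of these names exist in GCC.  */
enum op_kind { OP_ARITH, OP_CALL };
enum arith_code { ARITH_ADD, ARITH_SUB };
enum call_code { CALL_MAX, CALL_MIN };

struct op_code
{
  enum op_kind kind;
  int code;  /* an arith_code or a call_code, depending on KIND */
};

/* Fold OP applied to the constants A and B, choosing the folder by the
   kind of operation.  Returns false if the code is not known.  */
static bool
fold_binary (struct op_code op, long a, long b, long *result)
{
  if (op.kind == OP_ARITH)
    switch (op.code)
      {
      case ARITH_ADD: *result = a + b; return true;
      case ARITH_SUB: *result = a - b; return true;
      default: return false;
      }

  switch (op.code)
    {
    case CALL_MAX: *result = a > b ? a : b; return true;
    case CALL_MIN: *result = a < b ? a : b; return true;
    default: return false;
    }
}

/* Only arithmetic codes are asked whether they commute.  In this sketch
   ARITH_ADD and CALL_MAX share the value 0, so the kind must be checked
   before the code is interpreted.  */
static bool
may_swap_operands (struct op_code op)
{
  return op.kind == OP_ARITH && op.code == ARITH_ADD;
}
```

The second function is the part that corresponds to the added `is_tree_code ()` check in front of the commutativity predicates: without the kind test, a call code would be interpreted as if it were an arithmetic code.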
[PATCH][match-and-simplify] Split out combine from forwprop (a bit)
This physically separates forwprop and combining all stmts to allow an easy lattice implementation. Committed. Richard. 2014-03-13 Richard Biener rguent...@suse.de * tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Split out combining with gimple_match_and_simplify to a separate pass. Implement a lattice. Index: gcc/tree-ssa-forwprop.c === *** gcc/tree-ssa-forwprop.c (revision 208478) --- gcc/tree-ssa-forwprop.c (working copy) *** along with GCC; see the file COPYING3. *** 52,57 --- 52,59 #include optabs.h #include tree-ssa-propagate.h #include tree-ssa-dom.h + #include tree-cfgcleanup.h + #include tree-into-ssa.h /* This pass propagates the RHS of assignment statements into use sites of the LHS of the assignment. It's basically a specialized *** simplify_mult (gimple_stmt_iterator *gsi *** 3571,3576 --- 3573,3580 } + static vectree lattice; + /* Primitive lattice function for gimple_match_and_simplify to discard matches on names whose definition contains abnormal SSA names. */ *** static tree *** 3578,3589 fwprop_ssa_val (tree name) { if (TREE_CODE (name) == SSA_NAME ! /* ??? Instead match-and-simplify should make sure to not ! return a sequence that references abnormal SSA names. */ !(SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name) ! || (gimple_code (SSA_NAME_DEF_STMT (name)) != GIMPLE_PHI ! stmt_references_abnormal_ssa_name (SSA_NAME_DEF_STMT (name) ! return NULL_TREE; return name; } --- 3582,3593 fwprop_ssa_val (tree name) { if (TREE_CODE (name) == SSA_NAME !SSA_NAME_VERSION (name) lattice.length ()) ! { ! tree val = lattice[SSA_NAME_VERSION (name)]; ! if (val) ! return val; ! } return name; } *** ssa_forward_propagate_and_combine (void) *** 3598,3603 --- 3602,3665 cfg_changed = false; + /* Combine stmts with the stmts defining their operands. Do that + in an order that guarantees visiting SSA defs before SSA uses. 
*/ + lattice.create (num_ssa_names); + lattice.quick_grow_cleared (num_ssa_names); + int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); + int postorder_num = inverted_post_order_compute (postorder); + for (int i = 0; i postorder_num; ++i) + { + bb = BASIC_BLOCK_FOR_FN (cfun, postorder[i]); + /* ??? Maybe want to handle degenerate PHIs to populate +the lattice more optimistically. */ + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); + !gsi_end_p (gsi); gsi_next (gsi)) + { + gimple stmt = gsi_stmt (gsi); + + if (gimple_match_and_simplify (gsi, fwprop_ssa_val)) + { + stmt = gsi_stmt (gsi); + if (maybe_clean_or_replace_eh_stmt (stmt, stmt) + gimple_purge_dead_eh_edges (bb)) + cfg_changed = true; + if (dump_file (dump_flags TDF_DETAILS)) + { + fprintf (dump_file, gimple_match_and_simplified to ); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + } + + /* Fill up the lattice. */ + if (gimple_assign_single_p (stmt)) + { + tree lhs = gimple_assign_lhs (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + if (TREE_CODE (lhs) == SSA_NAME) + { + if (TREE_CODE (rhs) == SSA_NAME) + lattice[SSA_NAME_VERSION (lhs)] = fwprop_ssa_val (rhs); + else if (is_gimple_min_invariant (rhs)) + lattice[SSA_NAME_VERSION (lhs)] = rhs; + else + lattice[SSA_NAME_VERSION (lhs)] = lhs; + } + } + } + } + free (postorder); + lattice.release (); + + /* ??? Code below doesn't expect non-renamed VOPs and the above + doesn't keep virtual operand form up-to-date. */ + if (cfg_changed) + { + cleanup_tree_cfg (); + cfg_changed = false; + } + update_ssa (TODO_update_ssa_only_virtuals); + FOR_EACH_BB_FN (bb, cfun) { gimple_stmt_iterator gsi; *** ssa_forward_propagate_and_combine (void) *** 3699,3719 /* Mark stmt as potentially needing revisiting. 
*/ gimple_set_plf (stmt, GF_PLF_1, false); - if (gimple_match_and_simplify (gsi, fwprop_ssa_val)) - { - stmt = gsi_stmt (gsi); - if (maybe_clean_or_replace_eh_stmt (stmt, stmt) - gimple_purge_dead_eh_edges (bb)) - cfg_changed = true; - changed = true; - if (dump_file (dump_flags TDF_DETAILS)) - { - fprintf (dump_file, gimple_match_and_simplified to ); -
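The lattice introduced by this patch is, in essence, a table indexed by SSA version that maps a name to the value it is known to equal, filled while definitions are visited before their uses. A minimal C sketch of that idea follows. It uses a fixed-size array and a toy `struct value` instead of GCC's `vec<tree>`; the names `NUM_NAMES`, `record_copy` and the `VAL_*` kinds are assumptions made for the example.

```c
/* Toy value: nothing recorded, a name (identified by its index), or an
   integer constant.  These types are hypothetical, not GCC's trees.  */
enum val_kind { VAL_NONE, VAL_NAME, VAL_CONST };

struct value
{
  enum val_kind kind;
  long payload;  /* name index for VAL_NAME, the constant for VAL_CONST */
};

#define NUM_NAMES 8

/* One slot per name; zero-initialized, so every slot starts as VAL_NONE.  */
static struct value lattice[NUM_NAMES];

/* Return what V is known to be: the recorded value if V is a name with
   a lattice entry, otherwise V itself.  */
static struct value
valueize (struct value v)
{
  if (v.kind == VAL_NAME
      && v.payload >= 0 && v.payload < NUM_NAMES
      && lattice[v.payload].kind != VAL_NONE)
    return lattice[v.payload];
  return v;
}

/* Record the plain copy "LHS = RHS".  Because definitions are recorded
   before their uses, valueizing RHS first collapses copy chains.  */
static void
record_copy (long lhs, struct value rhs)
{
  if (lhs < 0 || lhs >= NUM_NAMES)
    return;
  lattice[lhs] = valueize (rhs);
}
```

For example, after recording `n0 = 5`, `n1 = n0` and `n2 = n1` in that order, `valueize` on `n2` yields the constant 5 directly, while a name with no entry is returned unchanged.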
Re: [PATCH] Try to avoid sorting on SSA_NAME_VERSION during reassoc (PR middle-end/60418)
On 03/13/2014 06:43 AM, Peter Bergner wrote: On Thu, 2014-03-13 at 10:30 +0100, Jakub Jelinek wrote: On Thu, Mar 13, 2014 at 10:25:57AM +0100, Richard Biener wrote: Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Does this also fix the PPC regression? That is a question for Peter, if we can ask him to test it. From what I understood, the bug no longer reproduces on the trunk on ppc*, and Peter tested an earlier version of this patch on top of an old revision where it still reproduced. Actually, I think it was Pat who did the testing earlier. Pat, can you confirm Jakub's patch fixes the PPC regression? Yes, this latest version of the patch also fixes the PPC regression on r204348 (last known revision to fail for PPC).
Re: [PATCH] Try to avoid sorting on SSA_NAME_VERSION during reassoc (PR middle-end/60418)
On Thu, Mar 13, 2014 at 09:53:47AM -0500, Pat Haugen wrote: On 03/13/2014 06:43 AM, Peter Bergner wrote: On Thu, 2014-03-13 at 10:30 +0100, Jakub Jelinek wrote: On Thu, Mar 13, 2014 at 10:25:57AM +0100, Richard Biener wrote: Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Does this also fix the PPC regression? That is a question for Peter, if we can ask him to test it. From what I understood, the bug no longer reproduces on the trunk on ppc*, and Peter tested an earlier version of this patch on top of an old revision where it still reproduced. Actually, I think it was Pat who did the testing earlier. Pat, can you confirm Jakub's patch fixes the PPC regression? Yes, this latest version of the patch also fixes the PPC regression on r204348 (last known revision to fail for PPC). Thanks. Jakub
[PATCH][match-and-simplify] Complete call support, add GENERIC matching support (kind-of)
The following finishes call matching support (see the added pattern and the testcase amendment). It also adds basic support for matching GENERIC expressions and calls - to eventually handle the weird case of GIMPLE containing a GENERIC tcc_comparison as operand zero of a COND_EXPR rhs. Committed. Richard. 2014-03-13 Richard Biener rguent...@suse.de * match.pd: Add pattern matching a call in a sub-expression. * genmatch.c (expr::gen_gimple_match): Complete call handling. Support matching GENERIC. testsuite/ * gcc.dg/tree-ssa/match-1.c: Adjust. Index: gcc/match.pd === --- gcc/match.pd(revision 208478) +++ gcc/match.pd(working copy) @@ -159,6 +202,10 @@ to (minus @1 @0) (match_and_simplify (BUILT_IN_CABS (complex @0 @0)) (mult (BUILT_IN_FABS @0) { build_real (TREE_TYPE (@0), real_value_truncate (TYPE_MODE (TREE_TYPE (@0)), dconst_sqrt2 ())); })) +/* One nested fn. */ +(match_and_simplify + (mult (BUILT_IN_POW @0 @1) @0) + (BUILT_IN_POW @0 (PLUS_EXPR @1 { build_one_cst (TREE_TYPE (@1)); }))) /* s Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 208539) +++ gcc/genmatch.c (working copy) @@ -278,9 +278,9 @@ expr::gen_gimple_match (FILE *f, const c if (operation-op-kind == id_base::CODE) { operator_id *op = static_cast operator_id * (operation-op); - fprintf (f, {\n); - fprintf (f, if (TREE_CODE (%s) != SSA_NAME) , name); - gen_gimple_match_fail (f, label); + /* The GIMPLE variant. */ + fprintf (f, if (TREE_CODE (%s) == SSA_NAME)\n, name); + fprintf (f, {\n); fprintf (f, gimple def_stmt = SSA_NAME_DEF_STMT (%s);\n, name); fprintf (f, if (!is_gimple_assign (def_stmt)\n || gimple_assign_rhs_code (def_stmt) != %s) , op-id); @@ -321,11 +321,83 @@ expr::gen_gimple_match (FILE *f, const c fprintf (f,}\n); } } - fprintf (f, }\n); + fprintf (f, }\n); + /* The GENERIC variant. 
*/ + fprintf (f, else if (TREE_CODE (%s) == %s)\n, name, op-id); + fprintf (f, {\n); + for (unsigned i = 0; i ops.length (); ++i) + { + fprintf (f,{\n); + fprintf (f, tree op = TREE_OPERAND (%s, %d);\n, name, i); + fprintf (f, if (valueize TREE_CODE (op) == SSA_NAME)\n); + fprintf (f,{\n); + fprintf (f, op = valueize (op);\n); + fprintf (f, if (!op) ); + gen_gimple_match_fail (f, label); + fprintf (f,}\n); + ops[i]-gen_gimple_match (f, op, label); + fprintf (f,}\n); + } + fprintf (f, }\n); + fprintf (f, else ); + gen_gimple_match_fail (f, label); +} + else if (operation-op-kind == id_base::FN) +{ + fn_id *op = static_cast fn_id * (operation-op); + /* The GIMPLE variant. */ + fprintf (f, if (TREE_CODE (%s) == SSA_NAME)\n, name); + fprintf (f, {\n); + fprintf (f, gimple def_stmt = SSA_NAME_DEF_STMT (%s);\n, name); + fprintf (f, tree fndecl;\n); + fprintf (f, if (!gimple_call_builtin_p (def_stmt, %s)) , op-id); + gen_gimple_match_fail (f, label); + for (unsigned i = 0; i ops.length (); ++i) + { + fprintf (f,{\n); + fprintf (f, tree op = gimple_call_arg (def_stmt, %d);\n, i); + fprintf (f, if (valueize TREE_CODE (op) == SSA_NAME)\n); + fprintf (f,{\n); + fprintf (f, op = valueize (op);\n); + fprintf (f, if (!op) ); + gen_gimple_match_fail (f, label); + fprintf (f,}\n); + ops[i]-gen_gimple_match (f, op, label); + fprintf (f,}\n); + } + fprintf (f, }\n); + /* GENERIC handling for calls. 
*/ + fprintf (f, else if (TREE_CODE (%s) == CALL_EXPR\n + TREE_CODE (CALL_EXPR_FN (%s)) == ADDR_EXPR\n + TREE_CODE (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == FUNCTION_DECL\n + DECL_BUILT_IN_CLASS (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == BUILT_IN_NORMAL\n + DECL_FUNCTION_CODE (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == %s)\n, + name, name, name, name, name, op-id); + fprintf (f, {\n); + for (unsigned i = 0; i ops.length (); ++i) + { + fprintf (f,{\n); + fprintf (f, tree op = CALL_EXPR_ARG (%s, %d);\n, name, i); + fprintf (f, if (valueize TREE_CODE (op) == SSA_NAME)\n); + fprintf (f,{\n); + fprintf (f, op = valueize (op);\n); + fprintf (f, if (!op) ); + gen_gimple_match_fail (f, label); + fprintf (f,}\n); + ops[i]-gen_gimple_match (f, op, label); + fprintf (f,}\n); + } + fprintf
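The pattern added to match.pd in this patch rewrites `pow (x, y) * x` into `pow (x, y + 1)`. As a quick sanity check of the underlying identity, here is a small C sketch that restricts itself to integer values and non-negative integer exponents, where the identity is exact. The helper `ipow` and the two wrapper functions are made up for the example; whether the rewrite is acceptable for floating-point operands is a separate question that this sketch does not address.

```c
/* Hypothetical helper: X raised to the non-negative integer power N.  */
static long
ipow (long x, unsigned n)
{
  long result = 1;
  while (n-- > 0)
    result *= x;
  return result;
}

/* The expression the pattern matches: pow (x, n) * x.  */
static long
before_rewrite (long x, unsigned n)
{
  return ipow (x, n) * x;
}

/* The expression the pattern produces: pow (x, n + 1).  */
static long
after_rewrite (long x, unsigned n)
{
  return ipow (x, n + 1);
}
```

For any small `x` and `n` that do not overflow `long`, `before_rewrite (x, n)` and `after_rewrite (x, n)` return the same value, which is the equality the pattern relies on.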
Re: [C++ Patch] PR 60383
OK. Jason
[wwwdocs] Update list of 4.9 secondary platforms
Hi!

I've been told that the SC has approved the secondary platform list changes, so I went ahead and committed the changes to our web pages.

--- gcc-4.9/criteria.html 15 Mar 2013 16:39:45 -0000 1.1
+++ gcc-4.9/criteria.html 13 Mar 2014 15:01:00 -0000
@@ -110,12 +110,12 @@ application testing.</p>
 
 <p>The secondary platforms are:</p>
 <ul>
-<li>hppa2.0w-hp-hpux11.11</li>
 <li>powerpc-ibm-aix7.1.0.0</li>
 <li>i686-apple-darwin</li>
 <li>i686-pc-cygwin</li>
 <li>i686-mingw32</li>
 <li>s390x-linux-gnu</li>
+<li>aarch64-elf</li>
 </ul>
 
 <h1>Code Quality and Compilation Time</h1>

Jakub
Re: [wwwdocs] Update list of 4.9 secondary platforms
On 03/13/14 09:02, Jakub Jelinek wrote: Hi! I've been told that the SC has approved the secondary platform list changes, so I went ahead and committed the changes to our web pages. Thanks. I should have taken care of this a few weeks ago when the decision was made. Hopefully everyone realizes that the committee is just dropping an effectively dead platform and adding a more vibrant platform in the secondary platform list. There were no changes to the primary platforms. Jeff
Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call
On Wed, Mar 12, 2014 at 10:52 PM, Wei Mi w...@google.com wrote: I saw the problem last patch had on ia32. Without explicit call in rtl template, scheduler may schedule the sp adjusting insn across tls descriptor and break the alignment assumption. I am testing the updated patch on x86_64. Can we combine the last two patches, both adding call explicitly in rtl template for tls_local_dynamic_base_32/tls_global_dynamic_32, and set ix86_tls_descriptor_calls_expanded_in_cfun to true only after reload complete? My ia32 change generates much worse code: [hjl@gnu-6 gcc]$ cat /tmp/c.i static __thread char ccc, bbb; int __cxa_get_globals() { return ccc - bbb; } [hjl@gnu-6 gcc]$ ./xgcc -B./ -S -O2 -fPIC /tmp/c.i [hjl@gnu-6 gcc]$ cat c.s .file c.i .section .text.unlikely,ax,@progbits .LCOLDB0: .text .LHOTB0: .p2align 4,,15 .globl __cxa_get_globals .type __cxa_get_globals, @function __cxa_get_globals: .LFB0: .cfi_startproc subq $8, %rsp .cfi_def_cfa_offset 16 leaq ccc@tlsld(%rip), %rdi call __tls_get_addr@PLT addq $8, %rsp .cfi_def_cfa_offset 8 leaq ccc@dtpoff(%rax), %rcx leaq bbb@dtpoff(%rax), %rdx movq %rcx, %rax subq %rdx, %rax ret .cfi_endproc .LFE0: .size __cxa_get_globals, .-__cxa_get_globals .section .text.unlikely .LCOLDE0: .text .LHOTE0: .section .tbss,awT,@nobits .type bbb, @object .size bbb, 1 bbb: .zero 1 .type ccc, @object .size ccc, 1 ccc: .zero 1 .ident GCC: (GNU) 4.9.0 20140312 (experimental) .section .note.GNU-stack,,@progbits [hjl@gnu-6 gcc]$ cat /tmp/c.i static __thread char ccc, bbb; int __cxa_get_globals() { return ccc - bbb; } [hjl@gnu-6 gcc]$ ./xgcc -B./ -S -O2 -fPIC /tmp/c.i -m32 [hjl@gnu-6 gcc]$ cat c.s .file c.i .section .text.unlikely,ax,@progbits .LCOLDB0: .text .LHOTB0: .p2align 4,,15 .globl __cxa_get_globals .type __cxa_get_globals, @function __cxa_get_globals: .LFB0: .cfi_startproc pushl %esi .cfi_def_cfa_offset 8 .cfi_offset 6, -8 pushl %ebx .cfi_def_cfa_offset 12 .cfi_offset 3, -12 call __x86.get_pc_thunk.bx addl $_GLOBAL_OFFSET_TABLE_, %ebx 
subl $4, %esp .cfi_def_cfa_offset 16 leal ccc@tlsldm(%ebx), %eax call ___tls_get_addr@PLT leal ccc@dtpoff(%eax), %esi leal ccc@tlsldm(%ebx), %eax call ___tls_get_addr@PLT addl $4, %esp .cfi_def_cfa_offset 12 leal bbb@dtpoff(%eax), %eax popl %ebx .cfi_restore 3 .cfi_def_cfa_offset 8 subl %eax, %esi movl %esi, %eax popl %esi .cfi_restore 6 .cfi_def_cfa_offset 4 ret .cfi_endproc Maybe we should keep the original patterns and split them to add CALL. -- H.J.
Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call
pr58066-2.patch worked for pr58066.c on ia32/x32/x86_64, but it failed on bootstrap. /usr/local/google/home/wmi/workarea/gcc-r208410-2/build/./gcc/xgcc -B/usr/local/google/home/wmi/workarea/gcc-r208410-2/build/./gcc/ -B/usr/local/google/home/wmi/workarea/gcc-r208410-2/build/install/x86_64-unknown-linux-gnu/bin/ -B/usr/local/google/home/wmi/workarea/gcc-r208410-2/build/install/x86_64-unknown-linux-gnu/lib/ -isystem /usr/local/google/home/wmi/workarea/gcc-r208410-2/build/install/x86_64-unknown-linux-gnu/include -isystem /usr/local/google/home/wmi/workarea/gcc-r208410-2/build/install/x86_64-unknown-linux-gnu/sys-include -g -O2 -m64 -O2 -g -O2 -DIN_GCC-W -Wall -Wwrite-strings -Wcast-qual -Wno-format -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fpic -mlong-double-80 -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -fpic -mlong-double-80 -I. -I. -I../../.././gcc -I../../../../src/libgcc -I../../../../src/libgcc/. -I../../../../src/libgcc/../gcc -I../../../../src/libgcc/../include -I../../../../src/libgcc/config/libbid -DENABLE_DECIMAL_BID_FORMAT -DHAVE_CC_TLS -DUSE_TLS -o bid_decimal_globals.o -MT bid_decimal_globals.o -MD -MP -MF bid_decimal_globals.dep -c ../../../../src/libgcc/config/libbid/bid_decimal_globals.c (call_insn 5 2 6 2 (parallel [ (set (reg/f:SI 85) (call:SI (mem:QI (symbol_ref:SI (___tls_get_addr)) [0 S1 A8]) (const_int 0 [0]))) (unspec:SI [ (reg:SI 3 bx) (symbol_ref:SI (__bid_IDEC_glbflags) [flags 0x10] var_decl 0x761c1da8 __bid_IDEC_glbflags) ] UNSPEC_TLS_GD) (clobber (reg:SI 91)) (clobber (reg:SI 92)) (clobber (reg:CC 17 flags)) ]) ../../../../src/libgcc/config/libbid/bid_decimal_globals.c:51 772 {*tls_global_dynamic_32_gnu} (expr_list:REG_UNUSED (reg:SI 92) (expr_list:REG_UNUSED (reg:SI 91) (nil))) (nil)) ../../../../src/libgcc/config/libbid/bid_decimal_globals.c:52:1: internal compiler error: in curr_insn_transform, at lra-constraints.c:3262 0xad8453 _fatal_insn(char const*, rtx_def const*, char const*, 
int, char const*) ../../src/gcc/rtl-error.c:109 0x9d1221 curr_insn_transform ../../src/gcc/lra-constraints.c:3262 0x9d40e4 lra_constraints(bool) ../../src/gcc/lra-constraints.c:4157 0x9c0ad8 lra(_IO_FILE*) ../../src/gcc/lra.c:2340 0x96e310 do_reload ../../src/gcc/ira.c:5457 0x96e622 rest_of_handle_reload ../../src/gcc/ira.c:5598 0x96e66c execute ../../src/gcc/ira.c:5627 The problem is the return value of the call may be assigned to a different hardreg than AX_REG. But LRA cannot do reload for output operand of call. The fix is to change the above pattern to the following pattern in legitimize_tls_address() in config/i386/i386.c. (call_insn/u 5 4 6 (parallel [ (set (reg:SI 0 ax) (call:SI (mem:QI (symbol_ref:SI (___tls_get_addr)) [0 S1 A8]) (const_int 0 [0]))) (unspec:SI [ (reg:SI 3 bx) (symbol_ref:SI (__bid_IDEC_glbflags) [flags 0x10] var_decl 0x75ef3da8 __bid_IDEC_glbflags) ] UNSPEC_TLS_GD) (clobber (scratch:SI)) (clobber (scratch:SI)) (clobber (reg:CC 17 flags)) ]) ../../../../src/libgcc/config/libbid/bid_decimal_globals.c:51 -1 (expr_list:REG_EH_REGION (const_int -2147483648 [0x8000]) (nil)) (nil)) (insn 6 5 7 (set (reg/f:SI 85) (reg:SI 0 ax)) ../../../../src/libgcc/config/libbid/bid_decimal_globals.c:51 -1 (expr_list:REG_EQUAL (symbol_ref:SI (__bid_IDEC_glbflags) [flags 0x10] var_decl 0x75ef3da8 __bid_IDEC_glbflags) After the problem is fixed, bootstrap and regression test on x86-64 are ok. Thanks, Wei. 
Index: config/i386/i386.md === --- config/i386/i386.md (revision 208410) +++ config/i386/i386.md (working copy) @@ -12859,13 +12859,14 @@ (define_insn *tls_global_dynamic_32_gnu [(set (match_operand:SI 0 register_operand =a) - (unspec:SI - [(match_operand:SI 1 register_operand b) - (match_operand 2 tls_symbolic_operand) - (match_operand 3 constant_call_address_operand z)] - UNSPEC_TLS_GD)) - (clobber (match_scratch:SI 4 =d)) - (clobber (match_scratch:SI 5 =c)) + (call:SI + (mem:QI (match_operand 3 constant_call_address_operand z)) + (match_operand 4))) + (unspec:SI [(match_operand:SI 1 register_operand b) + (match_operand 2 tls_symbolic_operand)] + UNSPEC_TLS_GD) + (clobber (match_scratch:SI 5 =d)) + (clobber (match_scratch:SI 6 =c)) (clobber (reg:CC FLAGS_REG))] !TARGET_64BIT TARGET_GNU_TLS { @@ -12885,13 +12886,19 @@ (define_expand tls_global_dynamic_32 [(parallel [(set (match_operand:SI 0 register_operand) - (unspec:SI [(match_operand:SI 2 register_operand) - (match_operand 1
Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call
My ia32 change generates much worse code: [hjl@gnu-6 gcc]$ cat /tmp/c.i static __thread char ccc, bbb; int __cxa_get_globals() { return ccc - bbb; } [hjl@gnu-6 gcc]$ ./xgcc -B./ -S -O2 -fPIC /tmp/c.i [hjl@gnu-6 gcc]$ cat c.s .file c.i .section .text.unlikely,ax,@progbits .LCOLDB0: .text .LHOTB0: .p2align 4,,15 .globl __cxa_get_globals .type __cxa_get_globals, @function __cxa_get_globals: .LFB0: .cfi_startproc subq $8, %rsp .cfi_def_cfa_offset 16 leaq ccc@tlsld(%rip), %rdi call __tls_get_addr@PLT addq $8, %rsp .cfi_def_cfa_offset 8 leaq ccc@dtpoff(%rax), %rcx leaq bbb@dtpoff(%rax), %rdx movq %rcx, %rax subq %rdx, %rax ret .cfi_endproc .LFE0: .size __cxa_get_globals, .-__cxa_get_globals .section .text.unlikely .LCOLDE0: .text .LHOTE0: .section .tbss,awT,@nobits .type bbb, @object .size bbb, 1 bbb: .zero 1 .type ccc, @object .size ccc, 1 ccc: .zero 1 .ident GCC: (GNU) 4.9.0 20140312 (experimental) .section .note.GNU-stack,,@progbits [hjl@gnu-6 gcc]$ cat /tmp/c.i static __thread char ccc, bbb; int __cxa_get_globals() { return ccc - bbb; } [hjl@gnu-6 gcc]$ ./xgcc -B./ -S -O2 -fPIC /tmp/c.i -m32 [hjl@gnu-6 gcc]$ cat c.s .file c.i .section .text.unlikely,ax,@progbits .LCOLDB0: .text .LHOTB0: .p2align 4,,15 .globl __cxa_get_globals .type __cxa_get_globals, @function __cxa_get_globals: .LFB0: .cfi_startproc pushl %esi .cfi_def_cfa_offset 8 .cfi_offset 6, -8 pushl %ebx .cfi_def_cfa_offset 12 .cfi_offset 3, -12 call __x86.get_pc_thunk.bx addl $_GLOBAL_OFFSET_TABLE_, %ebx subl $4, %esp .cfi_def_cfa_offset 16 leal ccc@tlsldm(%ebx), %eax call ___tls_get_addr@PLT leal ccc@dtpoff(%eax), %esi leal ccc@tlsldm(%ebx), %eax call ___tls_get_addr@PLT addl $4, %esp .cfi_def_cfa_offset 12 leal bbb@dtpoff(%eax), %eax popl %ebx .cfi_restore 3 .cfi_def_cfa_offset 8 subl %eax, %esi movl %esi, %eax popl %esi .cfi_restore 6 .cfi_def_cfa_offset 4 ret .cfi_endproc Maybe we should keep the original patterns and split them to add CALL. -- H.J. 
I tried pr58066-3.patch on the above testcase, and the code it generated seems ok. I think that after we change the 32-bit pattern in i386.md to be similar to the 64-bit pattern, we should also change the 32-bit expand in legitimize_tls_address to be similar to the 64-bit expand? Thanks, Wei. ~/workarea/gcc-r208410-2/build/install/bin/gcc -m32 -S -fPIC 1.c .file 1.c .section .tbss,awT,@nobits .type ccc, @object .size ccc, 1 ccc: .zero 1 .type bbb, @object .size bbb, 1 bbb: .zero 1 .text .globl __cxa_get_globals .type __cxa_get_globals, @function __cxa_get_globals: .LFB0: .cfi_startproc pushl %ebp .cfi_def_cfa_offset 8 .cfi_offset 5, -8 movl %esp, %ebp .cfi_def_cfa_register 5 pushl %esi pushl %ebx .cfi_offset 6, -12 .cfi_offset 3, -16 call __x86.get_pc_thunk.bx addl $_GLOBAL_OFFSET_TABLE_, %ebx leal ccc@tlsgd(,%ebx,1), %eax call ___tls_get_addr@PLT movl %eax, %esi leal bbb@tlsgd(,%ebx,1), %eax call ___tls_get_addr@PLT subl %eax, %esi movl %esi, %eax popl %ebx .cfi_restore 3 popl %esi .cfi_restore 6 popl %ebp .cfi_restore 5 .cfi_def_cfa 4, 4 ret .cfi_endproc .LFE0: .size __cxa_get_globals, .-__cxa_get_globals .section .text.__x86.get_pc_thunk.bx,axG,@progbits,__x86.get_pc_thunk.bx,comdat .globl __x86.get_pc_thunk.bx .hidden __x86.get_pc_thunk.bx .type __x86.get_pc_thunk.bx, @function __x86.get_pc_thunk.bx: .LFB1: .cfi_startproc movl (%esp), %ebx ret .cfi_endproc .LFE1: .ident GCC: (GNU) 4.9.0 20140307 (experimental) .section .note.GNU-stack,,@progbits
Re: [patch,avr] Fix PR59396: Ignore leading '*' in warning generation for ISR names
On Thu, Mar 13, 2014 at 02:24:06PM +0100, Georg-Johann Lay wrote: Problem is that the assembler name might or might not be prefixed by '*' depending on when TARGET_SET_CURRENT_FUNCTION is called. The change is just to fix wrong warning because the current implementation of TARGET_SET_CURRENT_FUNCTION /always/ skips the first char when the assembler name is set. FWIW, there's default_strip_name_encoding (varasm.c), which does the same thing, and is used by a couple of other targets. Regards Senthil PR target/59396 * config/avr/avr.c (avr_set_current_function): Skip the first char of the (assembler) name provided it's a '*'. Index: config/avr/avr.c === --- config/avr/avr.c (revision 208532) +++ config/avr/avr.c (working copy) @@ -600,10 +600,15 @@ avr_set_current_function (tree decl) const char *name; name = DECL_ASSEMBLER_NAME_SET_P (decl) -/* Remove the leading '*' added in set_user_assembler_name. */ -? 1 + IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)) +? IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)) : IDENTIFIER_POINTER (DECL_NAME (decl)); + /* Skip a leading '*' that might still prefix the assembler name, + e.g. in non-LTO runs. */ + + if (*name == '*') +name++; + /* Silently ignore 'signal' if 'interrupt' is present. AVR-LibC startet using this when it switched from SIGNAL and INTERRUPT to ISR. */
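As a rough illustration of the change discussed above (not the avr.c code itself), the sketch below contrasts always skipping the first character with skipping it only when it is a '*'. The function names and the sample symbol names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the patched logic: skip one leading '*'
   if it is present, otherwise return NAME unchanged.  */
static const char *
skip_star_prefix (const char *name)
{
  if (*name == '*')
    name++;
  return name;
}

/* Hypothetical stand-in for the old logic: always drop the first
   character, whether or not it is a '*'.  */
static const char *
skip_first_char (const char *name)
{
  return name + 1;
}

/* Return nonzero if the two strings are equal.  */
static int
same_name (const char *a, const char *b)
{
  return strcmp (a, b) == 0;
}
```

With a name that has no '*' prefix, the unconditional version would chop a real character off the symbol, which is the kind of mismatch that can trigger a wrong warning.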
[Ada] Fix PR ada/51483
This fixes a flaw in the mechanism implemented to register modes and types declared in the back-end with the front-end. The mechanism was implicitly making the assumption that it is possible to deduce the size of a FP mode from its precision and alignment; that's wrong in the general case, although exceptions to the rule are quite rare (typically for an IEEE extended mode). This changes the registration interface to accept a new 'precision' parameter in addition to the 'size' and align both notions in the back-end and in the front-end (the back-end precision was previously passed as the front-end size and the back-end size was second-guessed by the front-end). No functional changes on already working platforms. Tested on x86_64-suse-linux, applied on the mainline and, for a simplified version, on the 4.8 and 4.7 branches. 2014-03-13 Eric Botcazou ebotca...@adacore.com PR ada/51483 * cstand.adb (Register_Float_Type): Add 'precision' parameter and use it to set the RM size. Use directly 'size' for the Esize. (Create_Back_End_Float_Types): Adjust call to above. * get_targ.ads (Register_Type_Proc): Add 'precision' parameter. * set_targ.ads (FPT_Mode_Entry): Add 'precision' component. (Write_Target_Dependent_Values): Adjust comment. * set_targ.adb (Register_Float_Type): Add 'precision' parameter and deal with it. (Write_Target_Dependent_Values): Write the precision in lieu of size. (Initialization): Read the precision in lieu of size and compute the size from the precision and the alignment. * gcc-interface/gigi.h (enumerate_modes): Add integer parameter. * gcc-interface/misc.c (enumerate_modes): Likewise. Do not register types for vector modes, pass the size in addition to the precision. -- Eric Botcazou Index: get_targ.ads === --- get_targ.ads (revision 208528) +++ get_targ.ads (working copy) @@ -6,7 +6,7 @@ -- -- -- S p e c -- -- -- --- Copyright (C) 1992-2013, Free Software Foundation, Inc. -- +-- Copyright (C) 1992-2014, Free Software Foundation, Inc. 
-- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -28,8 +28,8 @@ -- exp_dbug and the elaboration of ttypes, via the Set_Targs package. -- It also contains the routine for registering floating-point types. --- NOTE: Any changes in this package must be reflected in jgettarg.ads --- and aa_getta.ads and any other versions of this package. +-- NOTE: Any changes in this package must be reflected in aa_getta.adb +-- and any other version in the various back ends. -- Note that all these values return sizes of C types with corresponding -- names. This allows GNAT to define the corresponding Ada types to have @@ -134,6 +134,7 @@ package Get_Targ is Complex : Boolean;-- True iff type has real and imaginary parts Count : Natural;-- Number of elements in vector, 0 otherwise Float_Rep : Float_Rep_Kind; -- Representation used for fpt type + Precision : Positive; -- Precision of representation in bits Size : Positive; -- Size of representation in bits Alignment : Natural); -- Required alignment in bits pragma Convention (C, Register_Type_Proc); Index: cstand.adb === --- cstand.adb (revision 208528) +++ cstand.adb (working copy) @@ -6,7 +6,7 @@ -- -- -- B o d y -- -- -- --- Copyright (C) 1992-2013, Free Software Foundation, Inc. -- +-- Copyright (C) 1992-2014, Free Software Foundation, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -158,6 +158,7 @@ package body CStand is (Name : String; Digs : Positive; Float_Rep : Float_Rep_Kind; + Precision : Positive; Size : Positive; Alignment : Natural); -- Registers a single back end floating-point type (from FPT_Mode_Table in @@ -167,7 +168,8 @@ package body CStand is -- as a normal format (non-null-terminated) string. Digs is the
patch to fix PR57189
The following patch fixes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57189 The patch was successfully bootstrapped and tested on x86-64. Committed as rev. 208549. 2014-03-13 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/57189 * lra-constraints.c (process_alt_operands): Disfavor spilling vector pseudos. 2014-03-13 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/57189 * gcc.target/i386/pr57189.c: New. --- a/gcc/lra-constraints.c +++ b/gcc/lra-constraints.c @@ -2302,9 +2302,20 @@ process_alt_operands (int only_alternative) if (lra_dump_file != NULL) fprintf (lra_dump_file, - %d Spill pseudo in memory: reject+=3\n, + %d Spill pseudo into memory: reject+=3\n, nop); reject += 3; + if (VECTOR_MODE_P (mode)) + { + /* Spilling vectors into memory is usually more + costly as they contain big values. */ + if (lra_dump_file != NULL) + fprintf + (lra_dump_file, + %d Spill vector pseudo: reject+=2\n, + nop); + reject += 2; + } } #ifdef SECONDARY_MEMORY_NEEDED diff --git a/gcc/testsuite/gcc.target/i386/pr57189.c b/gcc/testsuite/gcc.target/i386/pr57189.c new file mode 100644 index 000..389052c --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr57189.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -msse2 -march=k8 } */ +/* { dg-final { scan-assembler-not movaps } } */ + +typedef int __v4si __attribute__ ((__vector_size__ (16))); + +int test (__v4si __A) +{ + return __builtin_ia32_vec_ext_v4si (__A, 0); +}
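The effect of the lra-constraints.c hunk above can be summarised with a toy model: every spill of a pseudo into memory adds 3 to the reject count, and a vector pseudo adds 2 more. The function below is only a sketch of that arithmetic; `spill_reject` is a hypothetical name and nothing else from process_alt_operands is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the penalty in the patch: spilling any pseudo into
   memory costs 3, and a pseudo in a vector mode costs 2 on top.  */
static int
spill_reject (bool is_vector_mode)
{
  int reject = 0;

  reject += 3;		/* Spill pseudo into memory.  */
  if (is_vector_mode)
    reject += 2;	/* Spill vector pseudo.  */
  return reject;
}
```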
[PATCH, PR 60461] Fix loop condition at the end of ipa_modify_call_arguments
Hi, the reference re-building part of ipa_modify_call_arguments has a stupid bug in the while loop condition, which means it does not look at statements produced by force_gimple_operand_gsi which may lead to bugs such as PR 60461. The 4.8 branch does not have this code and does not exhibit the bug so for the time being I'm leaving the code alone there. I have bootstrapped and tested the following patch on x86_64-linux on trunk and will commit it there shortly as obvious. Thanks, Martin 2014-03-13 Martin Jambor mjam...@suse.cz PR lto/60461 * ipa-prop.c (ipa_modify_call_arguments): Fix iteration condition. testsuite/ * gcc.dg/lto/pr60461_0.c: New test. diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 4fb916a..efe8c7a 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -3901,7 +3901,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt, gsi_prev (gsi); } while ((gsi_end_p (prev_gsi) && !gsi_end_p (gsi)) -|| (!gsi_end_p (prev_gsi) && gsi_stmt (gsi) == gsi_stmt (prev_gsi))); +|| (!gsi_end_p (prev_gsi) && gsi_stmt (gsi) != gsi_stmt (prev_gsi))); } /* If the expression *EXPR should be replaced by a reduction of a parameter, do diff --git a/gcc/testsuite/gcc.dg/lto/pr60461_0.c b/gcc/testsuite/gcc.dg/lto/pr60461_0.c new file mode 100644 index 000..cad6a8d --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr60461_0.c @@ -0,0 +1,37 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options {{-Os -flto} } } */ + + +struct S +{ + int f1; + int f2; +} a[1] = { {0, 0} }; + +int b, c; + +static unsigned short fn1 (struct S); + +void +fn2 () +{ + for (; c;) +; + b = 0; + fn1 (a[0]); +} + +unsigned short +fn1 (struct S p) +{ + if (p.f1) +fn2 (); + return 0; +} + +int +main () +{ + fn2 (); + return 0; +}
Re: [PATCH, PR 60461] Fix loop condition at the end of ipa_modify_call_arguments
On Thu, Mar 13, 2014 at 04:56:12PM +0100, Martin Jambor wrote: PR lto/60461 * ipa-prop.c (ipa_modify_call_arguments): Fix iteration condition. testsuite/ * gcc.dg/lto/pr60461_0.c: New test. diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 4fb916a..efe8c7a 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -3901,7 +3901,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt, gsi_prev (gsi); } while ((gsi_end_p (prev_gsi) && !gsi_end_p (gsi)) - || (!gsi_end_p (prev_gsi) && gsi_stmt (gsi) == gsi_stmt (prev_gsi))); + || (!gsi_end_p (prev_gsi) && gsi_stmt (gsi) != gsi_stmt (prev_gsi))); } Doesn't (in 4.8+ of course) gsi_stmt return NULL iff gsi_end_p? Thus, can't this be simplified into: } while (gsi_stmt (gsi) != gsi_stmt (prev_gsi)); ? Jakub
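The simplification suggested above can be checked on a toy model. The sketch below assumes, as the question does, that the statement accessor returns NULL exactly when the iterator is at the end; `toy_gsi`, `toy_end_p` and `toy_stmt` are hypothetical stand-ins, not the real gimple iterator API. It compares the patched two-part condition with the short form over every combination of "at end", "at statement A" and "at statement B".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy iterator: STMT is NULL exactly when the iterator is at the end.  */
struct toy_gsi { const int *stmt; };

static bool toy_end_p (struct toy_gsi i) { return i.stmt == NULL; }
static const int *toy_stmt (struct toy_gsi i) { return i.stmt; }

/* The patched two-part loop condition.  */
static bool
long_cond (struct toy_gsi gsi, struct toy_gsi prev)
{
  return (toy_end_p (prev) && !toy_end_p (gsi))
	 || (!toy_end_p (prev) && toy_stmt (gsi) != toy_stmt (prev));
}

/* The proposed simplification.  */
static bool
short_cond (struct toy_gsi gsi, struct toy_gsi prev)
{
  return toy_stmt (gsi) != toy_stmt (prev);
}

/* Return true if both conditions agree for all nine combinations of
   iterator states.  */
static bool
conditions_agree (void)
{
  static const int a = 0, b = 0;	/* Two distinct "statements".  */
  const int *states[] = { NULL, &a, &b };

  for (int i = 0; i < 3; i++)
    for (int j = 0; j < 3; j++)
      {
	struct toy_gsi gsi = { states[i] }, prev = { states[j] };
	if (long_cond (gsi, prev) != short_cond (gsi, prev))
	  return false;
      }
  return true;
}
```

Under that NULL-at-end assumption the two forms agree in every case, so the shorter condition is a drop-in replacement in this model.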
Re: [Ada] Fix PR ada/51483
On Mar 13, 2014, at 11:36, Eric Botcazou ebotca...@adacore.com wrote: This fixes a flaw in the mechanism implemented to register modes and types declared in the back-end with the front-end. The mechanism was implicitly making the assumption that it is possible to deduce the size of a FP mode from its precision and alignment; that's wrong in the general case, although exceptions to the rule are quite rare (typically for an IEEE extended mode). Thanks, Eric! -Geert
Re: RFA: New ipa-devirt PATCH for c++/58678 (devirt vs. KDE)
On 03/11/2014 05:08 PM, Jan Hubicka wrote: Jason, I was looking into this and I think I have a patch that works. I would just like to verify I understand things right. The first thing I implemented is to consistently skip dtors of abstract classes, since per the comment in abstract_class_dtor_p there is no way to call those by virtual table pointer. Unlike your patch it will i.e. enable better unreachable code removal since they will not appear in possible target lists of polymorphic calls. Makes sense. The second change I did is to move methods that are reachable only via an abstract class into the part of the list that is in construction, since obviously we do not have instances of these classes. I'm not sure how you would tell that a method is reachable only via an abstract class; a derived class doesn't have to override methods other than the destructor, so we could get the abstract class method for an object of a derived class. Yep, this can apply only to anonymous namespace types where we know the derivations. So it is not a big win. What I would like to verify with you is that I also changed the walk when looking for destructors to not consider types in construction. I believe there is no way to get a destructor call via a construction vtable as we always know the type. Is that right? I guess it would be possible to get the abstract destructor via construction vtable if someone deletes the object while it's being constructed. But surely that's undefined behavior anyway. Also, if abstract_class_dtor_p functions are never called via vtables, is there a reason for the C++ FE to put them there? I understand that there is a slot in the vtable initializer for them, but things would go smoother if it was initialized to NULL or some other marker different from cxa_pure_virtual. Then gimple-fold will already substitute it for builtin_unreachable and they will get ignored during the ipa-devirt's walks. Hmm, interesting idea. Shall I implement that? 
The less we have in the vtables, the better, so if you can implement this, it would be great. Honza Jason
[C++ PATCH] [gomp4] Initial OpenACC support to C++ front-end
On 07.03.2014 15:37, Ilmir Usmanov wrote: Hi Thomas! I prepared simple patch to add support of OpenACC data, kernels and parallel constructs to C++ FE. It adds support of data clauses too. OK to gomp4 branch? Fixed subject: changed file extensions of tests and fixed comments. OK to gomp4 branch? -- Ilmir. From 8368b5196c1201401e1f8301107f11c9e6f064b1 Mon Sep 17 00:00:00 2001 From: Ilmir Usmanov i.usma...@samsung.com Date: Thu, 13 Mar 2014 20:59:34 +0400 Subject: [PATCH] Initial OpenACC support to C++ FE --- Initial OpenACC support to C++ front-end: parallel, kernels and data construct with data clauses. gcc/cp/ * cp-tree.h (finish_oacc_data): New function prototype. (finish_oacc_kernels, finish_oacc_parallel): Likewise. * parser.c (cp_parser_omp_clause_name): Support data clauses. (cp_parser_oacc_data_clause): New function. (cp_parser_oacc_data_clause_deviceptr, cp_parser_oacc_all_clauses, cp_parser_oacc_data, cp_parser_oacc_kernels, cp_parser_oacc_parallel): Likewise. (OACC_DATA_CLAUSE_MASK): New define. (OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Likewise. (cp_parser_omp_construct, cp_parser_pragma): Support OpenACC directives. * semantics.c (finish_oacc_data): New function. (finish_oacc_kernels, finish_oacc_parallel): Likewise. * gcc/testsuite/c-c++-common/goacc/deviceptr-1.c: Move to ... * gcc/testsuite/gcc.dg/goacc/deviceptr-1.c ... here. * gcc/testsuite/g++.dg/goacc/goacc.exp: New test directory. * gcc/testsuite/g++.dg/goacc-gomp/goacc-gomp.exp: Likewise. gcc/testsuite/g++.dg/goacc/ * deviceptr-1.cpp: New test. * sb-1.cpp: Likewise. * sb-2.cpp: Likewise. 
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 8ec7d6a..9b966d2 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -5825,6 +5825,9 @@ extern tree finish_omp_clauses (tree); extern void finish_omp_threadprivate (tree); extern tree begin_omp_structured_block (void); extern tree finish_omp_structured_block (tree); +extern tree finish_oacc_data (tree, tree); +extern tree finish_oacc_kernels (tree, tree); +extern tree finish_oacc_parallel (tree, tree); extern tree begin_omp_parallel (void); extern tree finish_omp_parallel (tree, tree); extern tree begin_omp_task (void); diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 8c167c7..94d9e22 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -26935,16 +26935,26 @@ cp_parser_omp_clause_name (cp_parser *parser) case 'c': if (!strcmp (collapse, p)) result = PRAGMA_OMP_CLAUSE_COLLAPSE; + else if (!strcmp (copy, p)) + result = PRAGMA_OMP_CLAUSE_COPY; else if (!strcmp (copyin, p)) result = PRAGMA_OMP_CLAUSE_COPYIN; + else if (!strcmp (copyout, p)) + result = PRAGMA_OMP_CLAUSE_COPYOUT; else if (!strcmp (copyprivate, p)) result = PRAGMA_OMP_CLAUSE_COPYPRIVATE; + else if (!strcmp (create, p)) + result = PRAGMA_OMP_CLAUSE_CREATE; break; case 'd': - if (!strcmp (depend, p)) + if (!strcmp (delete, p)) + result = PRAGMA_OMP_CLAUSE_DELETE; + else if (!strcmp (depend, p)) result = PRAGMA_OMP_CLAUSE_DEPEND; else if (!strcmp (device, p)) result = PRAGMA_OMP_CLAUSE_DEVICE; + else if (!strcmp (deviceptr, p)) + result = PRAGMA_OMP_CLAUSE_DEVICEPTR; else if (!strcmp (dist_schedule, p)) result = PRAGMA_OMP_CLAUSE_DIST_SCHEDULE; break; @@ -26993,6 +27003,22 @@ cp_parser_omp_clause_name (cp_parser *parser) case 'p': if (!strcmp (parallel, p)) result = PRAGMA_OMP_CLAUSE_PARALLEL; + else if (!strcmp (present, p)) + result = PRAGMA_OMP_CLAUSE_PRESENT; + else if (!strcmp (present_or_copy, p) + || !strcmp (pcopy, p)) + result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY; + else if (!strcmp (present_or_copyin, p) + || !strcmp (pcopyin, p)) + 
result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN; + else if (!strcmp (present_or_copyout, p) + || !strcmp (pcopyout, p)) + result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYOUT; + else if (!strcmp (present_or_create, p) + || !strcmp (pcreate, p)) + result = PRAGMA_OMP_CLAUSE_PRESENT_OR_CREATE; + else if (!strcmp (private, p)) + result = PRAGMA_OMP_CLAUSE_PRIVATE; else if (!strcmp (proc_bind, p)) result = PRAGMA_OMP_CLAUSE_PROC_BIND; break; @@ -27200,6 +27226,111 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list) return list; } +/* OpenACC 2.0: + copy ( variable-list ) + copyin ( variable-list ) + copyout ( variable-list ) + create ( variable-list ) + delete ( variable-list ) + present ( variable-list ) + present_or_copy ( variable-list ) + pcopy ( variable-list ) + present_or_copyin ( variable-list ) + pcopyin ( variable-list ) + present_or_copyout ( variable-list ) + pcopyout ( variable-list ) + present_or_create ( variable-list ) + pcreate ( variable-list ) */ + +static tree +cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind, + tree list) +{ + enum
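For readers unfamiliar with the constructs this patch teaches the C++ front end to parse, here is a small sketch of a data construct with data clauses wrapping a parallel construct. It is written in C so that it stays in the language of the surrounding patches; the function names are hypothetical, and whether anything is actually offloaded depends entirely on the compiler and its options. A compiler without OpenACC support is expected to ignore the pragmas and run the loop on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Scale N floats from IN into OUT.  The data construct names what is
   copied to and from the device; the parallel construct marks the
   region to run there.  Without OpenACC support the pragmas are
   ignored and this is an ordinary serial loop.  */
static void
scale (const float *in, float *out, size_t n, float k)
{
#pragma acc data copyin (in[0:n]) copyout (out[0:n])
  {
#pragma acc parallel
    {
      for (size_t i = 0; i < n; i++)
	out[i] = k * in[i];
    }
  }
}

/* Run scale on a small fixed input and return the third result.  */
static float
scale_demo (void)
{
  const float in[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
  float out[4] = { 0 };

  scale (in, out, 4, 2.0f);
  return out[2];
}
```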
Re: [PATCH, libiberty]: Avoid 'right-hand operand of comma expression has no effect' when compiling regex.c
On Thu, Mar 13, 2014 at 3:36 AM, Uros Bizjak ubiz...@gmail.com wrote: Attached patch changes the return value of the bzero macro to void, as defined in 4.3BSD: void bzero(void *s, size_t n); As an additional benefit, the changed macro now generates a warning when its return value is used (which is *not* the case in regex.c): I'm not worried about anybody using the return value incorrectly in this file. I think we should just # define bzero(s, n) memset (s, '\0', n) I'll approve that change if it works. Ian
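The two options in this exchange can be sketched side by side. The macro names below are hypothetical (chosen to avoid clashing with any libc bzero), and the void-cast form is only written in the spirit of the 4.3BSD prototype; it is not claimed to be the text of the attached patch.

```c
#include <assert.h>
#include <string.h>

/* The form suggested in the reply: plain memset, whose result the
   caller is free to ignore.  */
#define my_bzero(s, n) memset (s, '\0', n)

/* A void-valued form: using its result in an expression is rejected
   by the compiler.  */
#define my_bzero_void(s, n) ((void) memset (s, '\0', n))

/* Fill a buffer with 'x', clear each half with one of the macros and
   return 1 if every byte ended up zero.  */
static int
zeroes_buffer (void)
{
  char buf[8];

  memset (buf, 'x', sizeof buf);
  my_bzero (buf, 4);
  my_bzero_void (buf + 4, 4);
  for (size_t i = 0; i < sizeof buf; i++)
    if (buf[i] != 0)
      return 0;
  return 1;
}
```

Neither form contains a comma expression, so neither should provoke the warning named in the subject.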
Re: [PATCH, PR 60461] Fix loop condition at the end of ipa_modify_call_arguments
On Thu, Mar 13, 2014 at 05:19:02PM +0100, Jakub Jelinek wrote: On Thu, Mar 13, 2014 at 04:56:12PM +0100, Martin Jambor wrote: PR lto/60461 * ipa-prop.c (ipa_modify_call_arguments): Fix iteration condition. testsuite/ * gcc.dg/lto/pr60461_0.c: New test. diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 4fb916a..efe8c7a 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -3901,7 +3901,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt, gsi_prev (gsi); } while ((gsi_end_p (prev_gsi) && !gsi_end_p (gsi)) -|| (!gsi_end_p (prev_gsi) && gsi_stmt (gsi) == gsi_stmt (prev_gsi))); +|| (!gsi_end_p (prev_gsi) && gsi_stmt (gsi) != gsi_stmt (prev_gsi))); } Doesn't (in 4.8+ of course) gsi_stmt return NULL iff gsi_end_p? Thus, can't this be simplified into: } while (gsi_stmt (gsi) != gsi_stmt (prev_gsi)); ? Apparently it does. I will commit the simplified condition after testing then. Thanks, Martin
[PATCH] Fix overflows in get_ref_base_and_extent (PR tree-optimization/59779)
Hi! The outer-1.c testcase apparently fails on 32-bit HWI targets, the problem is that the int x[1][1]; array has bitsize that fits into uhwi, but not shwi, so we get negative maxsize that isn't -1. After discussions with Richard on IRC, I've implemented computation of bitsize and maxsize in double_int. Bootstrapped/regtested on x86_64-linux and i686-linux, with bootstrap/regtest time changes in the noise. Ok for trunk? John, could you please test this on some 32-bit HWI target with bootstrap/regtest? 2014-03-13 Jakub Jelinek ja...@redhat.com PR tree-optimization/59779 * tree-dfa.c (get_ref_base_and_extent): Use double_int type for bitsize and maxsize instead of HOST_WIDE_INT. --- gcc/tree-dfa.c.jj 2014-01-03 11:40:57.0 +0100 +++ gcc/tree-dfa.c 2014-03-13 12:10:46.367886640 +0100 @@ -389,11 +389,10 @@ get_ref_base_and_extent (tree exp, HOST_ HOST_WIDE_INT *psize, HOST_WIDE_INT *pmax_size) { - HOST_WIDE_INT bitsize = -1; - HOST_WIDE_INT maxsize = -1; + double_int bitsize = double_int_minus_one; + double_int maxsize; tree size_tree = NULL_TREE; double_int bit_offset = double_int_zero; - HOST_WIDE_INT hbit_offset; bool seen_variable_array_ref = false; /* First get the final access size from just the outermost expression. */ @@ -407,15 +406,11 @@ get_ref_base_and_extent (tree exp, HOST_ if (mode == BLKmode) size_tree = TYPE_SIZE (TREE_TYPE (exp)); else - bitsize = GET_MODE_BITSIZE (mode); -} - if (size_tree != NULL_TREE) -{ - if (! tree_fits_uhwi_p (size_tree)) - bitsize = -1; - else - bitsize = tree_to_uhwi (size_tree); + bitsize = double_int::from_uhwi (GET_MODE_BITSIZE (mode)); } + if (size_tree != NULL_TREE + TREE_CODE (size_tree) == INTEGER_CST) +bitsize = tree_to_double_int (size_tree); /* Initially, maxsize is the same as the accessed element size. In the following it will only grow (or become -1). 
*/ @@ -448,7 +443,7 @@ get_ref_base_and_extent (tree exp, HOST_ referenced the last field of a struct or a union member then we have to adjust maxsize by the padding at the end of our field. */ - if (seen_variable_array_ref maxsize != -1) + if (seen_variable_array_ref !maxsize.is_minus_one ()) { tree stype = TREE_TYPE (TREE_OPERAND (exp, 0)); tree next = DECL_CHAIN (field); @@ -459,15 +454,22 @@ get_ref_base_and_extent (tree exp, HOST_ { tree fsize = DECL_SIZE_UNIT (field); tree ssize = TYPE_SIZE_UNIT (stype); - if (tree_fits_shwi_p (fsize) -tree_fits_shwi_p (ssize) -doffset.fits_shwi ()) - maxsize += ((tree_to_shwi (ssize) - - tree_to_shwi (fsize)) - * BITS_PER_UNIT - - doffset.to_shwi ()); + if (fsize == NULL + || TREE_CODE (fsize) != INTEGER_CST + || ssize == NULL + || TREE_CODE (ssize) != INTEGER_CST) + maxsize = double_int_minus_one; else - maxsize = -1; + { + double_int tem = tree_to_double_int (ssize) +- tree_to_double_int (fsize); + if (BITS_PER_UNIT == 8) + tem = tem.lshift (3); + else + tem *= double_int::from_uhwi (BITS_PER_UNIT); + tem -= doffset; + maxsize += tem; + } } } } @@ -477,13 +479,12 @@ get_ref_base_and_extent (tree exp, HOST_ /* We need to adjust maxsize to the whole structure bitsize. But we can subtract any constant offset seen so far, because that would get us out of the structure otherwise. */ - if (maxsize != -1 + if (!maxsize.is_minus_one () csize -tree_fits_uhwi_p (csize) -bit_offset.fits_shwi ()) - maxsize = tree_to_uhwi (csize) - bit_offset.to_shwi (); +TREE_CODE (csize) == INTEGER_CST) + maxsize = tree_to_double_int (csize) - bit_offset; else - maxsize = -1; + maxsize = double_int_minus_one; } } break; @@ -520,13 +521,12 @@ get_ref_base_and_extent (tree exp, HOST_
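The failure mode described in this message, a bit size that fits an unsigned HOST_WIDE_INT but not a signed one, can be reproduced with fixed-width types. The sketch below models a 32-bit HWI; the helper names and the 256 MiB object are hypothetical and are not taken from the testcase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model a 32-bit HOST_WIDE_INT: does V fit the unsigned or the signed
   variant?  */
static bool fits_uhwi32 (uint64_t v) { return v <= UINT32_MAX; }
static bool fits_shwi32 (uint64_t v) { return v <= INT32_MAX; }

/* Size in bits of an object of BYTES bytes, computed in 64 bits so
   the multiplication cannot wrap for the values used here.  */
static uint64_t
bits_of (uint64_t bytes)
{
  return bytes * 8;
}
```

An object of 0x10000000 bytes has 2^31 bits: representable as an unsigned 32-bit value, but stored in a signed 32-bit variable it would typically show up as a negative number other than -1, which matches the "negative maxsize that isn't -1" symptom.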
Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call
Can we combine the last two patches, both adding call explicitly in rtl template for tls_local_dynamic_base_32/tls_global_dynamic_32, and set ix86_tls_descriptor_calls_expanded_in_cfun to true only after reload complete? Hi H.J. I attached the patch which combined your two patches and the fix in legitimize_tls_address. I tried pr58066.c and c.i in ia32/x32/x86_64, the code looked fine. Do you think it is ok? Thanks, Wei. Index: config/i386/i386.c === --- config/i386/i386.c (revision 208410) +++ config/i386/i386.c (working copy) @@ -9082,7 +9082,7 @@ ix86_frame_pointer_required (void) we've not got a leaf function. */ if (TARGET_OMIT_LEAF_FRAME_POINTER (!crtl-is_leaf - || ix86_current_function_calls_tls_descriptor)) + || ix86_tls_descriptor_calls_expanded_in_cfun)) return true; if (crtl-profile !flag_fentry) @@ -9331,7 +9331,7 @@ ix86_select_alt_pic_regnum (void) { if (crtl-is_leaf !crtl-profile - !ix86_current_function_calls_tls_descriptor) + !ix86_tls_descriptor_calls_expanded_in_cfun) { int i, drap; /* Can't use the same register for both PIC and DRAP. */ @@ -9490,20 +9490,28 @@ ix86_compute_frame_layout (struct ix86_f frame-nregs = ix86_nsaved_regs (); frame-nsseregs = ix86_nsaved_sseregs (); - stack_alignment_needed = crtl-stack_alignment_needed / BITS_PER_UNIT; - preferred_alignment = crtl-preferred_stack_boundary / BITS_PER_UNIT; - /* 64-bit MS ABI seem to require stack alignment to be always 16 except for function prologues and leaf. */ - if ((TARGET_64BIT_MS_ABI preferred_alignment 16) + if ((TARGET_64BIT_MS_ABI crtl-preferred_stack_boundary 128) (!crtl-is_leaf || cfun-calls_alloca != 0 - || ix86_current_function_calls_tls_descriptor)) + || ix86_tls_descriptor_calls_expanded_in_cfun)) { - preferred_alignment = 16; - stack_alignment_needed = 16; crtl-preferred_stack_boundary = 128; crtl-stack_alignment_needed = 128; } + /* preferred_stack_boundary is never updated for call expanded from + tls descriptor. Update it here. 
We don't update it in expand stage + because tls calls may be optimized away. */ + else if (ix86_tls_descriptor_calls_expanded_in_cfun + crtl-preferred_stack_boundary PREFERRED_STACK_BOUNDARY) +{ + crtl-preferred_stack_boundary = PREFERRED_STACK_BOUNDARY; + if (crtl-stack_alignment_needed PREFERRED_STACK_BOUNDARY) + crtl-stack_alignment_needed = PREFERRED_STACK_BOUNDARY; +} + + stack_alignment_needed = crtl-stack_alignment_needed / BITS_PER_UNIT; + preferred_alignment = crtl-preferred_stack_boundary / BITS_PER_UNIT; gcc_assert (!size || stack_alignment_needed); gcc_assert (preferred_alignment = STACK_BOUNDARY / BITS_PER_UNIT); @@ -9608,7 +9616,7 @@ ix86_compute_frame_layout (struct ix86_f || size != 0 || !crtl-is_leaf || cfun-calls_alloca - || ix86_current_function_calls_tls_descriptor) + || ix86_tls_descriptor_calls_expanded_in_cfun) offset = (offset + stack_alignment_needed - 1) -stack_alignment_needed; /* Frame pointer points here. */ @@ -9623,7 +9631,7 @@ ix86_compute_frame_layout (struct ix86_f of stack frame are unused. */ if (ACCUMULATE_OUTGOING_ARGS (!crtl-is_leaf || cfun-calls_alloca - || ix86_current_function_calls_tls_descriptor)) + || ix86_tls_descriptor_calls_expanded_in_cfun)) { offset += crtl-outgoing_args_size; frame-outgoing_arguments_size = crtl-outgoing_args_size; @@ -9634,7 +9642,7 @@ ix86_compute_frame_layout (struct ix86_f /* Align stack boundary. Only needed if we're calling another function or using alloca. */ if (!crtl-is_leaf || cfun-calls_alloca - || ix86_current_function_calls_tls_descriptor) + || ix86_tls_descriptor_calls_expanded_in_cfun) offset = (offset + preferred_alignment - 1) -preferred_alignment; /* We've reached end of stack frame. 
*/ @@ -9650,7 +9658,7 @@ ix86_compute_frame_layout (struct ix86_f if (ix86_using_red_zone () crtl-sp_is_unchanging crtl-is_leaf - !ix86_current_function_calls_tls_descriptor) + !ix86_tls_descriptor_calls_expanded_in_cfun) { frame-red_zone_size = to_allocate; if (frame-save_regs_using_mov) @@ -10623,7 +10631,7 @@ ix86_finalize_stack_realign_flags (void) crtl-is_leaf flag_omit_frame_pointer crtl-sp_is_unchanging - !ix86_current_function_calls_tls_descriptor + !ix86_tls_descriptor_calls_expanded_in_cfun !crtl-accesses_prior_frames !cfun-calls_alloca !crtl-calls_eh_return @@ -13437,26 +13445,25 @@ legitimize_tls_address (rtx x, enum tls_ else { rtx caddr = ix86_tls_get_addr (); + rtx ax = gen_rtx_REG (Pmode, AX_REG); + rtx insns; + start_sequence (); if (TARGET_64BIT) - { -
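Two small pieces of arithmetic carry the frame-layout hunks above: rounding an offset up to an alignment, and raising the preferred stack boundary when it is below the target default. The sketch below shows both in isolation. The function names are hypothetical, and because the archived text has lost its operators, the '&' in the rounding and the '<' in the comparison are assumptions about what the patch says.

```c
#include <assert.h>

/* Round OFFSET up to the next multiple of ALIGN.  ALIGN must be a
   power of two.  */
static long
round_up (long offset, long align)
{
  return (offset + align - 1) & -align;
}

/* Return CURRENT raised to at least PREFERRED (both in bits); a value
   that is already large enough is left alone.  */
static unsigned
raise_boundary (unsigned current, unsigned preferred)
{
  return current < preferred ? preferred : current;
}
```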
[PATCH] Fix up #pragma weak handling (PR middle-end/36282)
Hi! We get a bogus warning on the -1 and -4 testcases. The problem is that we accept without warning an __asm rename like: extern void *baz (void *dest, const void *src, __SIZE_TYPE__ n); extern __typeof (baz) baz __asm(bazfn); only if it doesn't have DECL_ASSEMBLER_NAME_SET_P yet; after it is set we just warn and ignore the rename. But for #pragma weak, if there are any pending #pragma weak pragmas, we actually set DECL_ASSEMBLER_NAME right away. Fixed by computing DECL_ASSEMBLER_NAME for the #pragma weak handling just temporarily if it hasn't been set yet (yeah, it is duplicate work then, but hopefully not very common), plus also fixing the early outs - if a decl matching some #pragma weak is found, that vector entry is removed from pending_weaks, but we were only testing if pending_weaks is NULL, which it will never be after it has been allocated once. Thus, for the most common case (well, most code doesn't use #pragma weak at all) where you have #pragma weak followed by corresponding prototype or say a couple of #pragma weak directives followed by corresponding prototypes, there should be either no or only a very small number of extra decl_assembler_name invocations. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-03-13 Jakub Jelinek ja...@redhat.com PR middle-end/36282 * c-pragma.c (apply_pragma_weak): Only look at TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl)) if DECL_ASSEMBLER_NAME_SET_P (decl). (maybe_apply_pending_pragma_weaks): Exit early if vec_safe_is_empty (pending_weaks) rather than only when !pending_weaks. (maybe_apply_pragma_weak): Likewise. If !DECL_ASSEMBLER_NAME_SET_P, set assembler name back to NULL afterwards. * c-c++-common/pr36282-1.c: New test. * c-c++-common/pr36282-2.c: New test. * c-c++-common/pr36282-3.c: New test. * c-c++-common/pr36282-4.c: New test. 
--- gcc/c-family/c-pragma.c.jj 2014-03-06 13:05:17.0 +0100 +++ gcc/c-family/c-pragma.c 2014-03-13 14:39:44.200586561 +0100 @@ -263,6 +263,7 @@ apply_pragma_weak (tree decl, tree value if (SUPPORTS_WEAK DECL_EXTERNAL (decl) TREE_USED (decl) !DECL_WEAK (decl) /* Don't complain about a redundant #pragma. */ + DECL_ASSEMBLER_NAME_SET_P (decl) TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl))) warning (OPT_Wpragmas, applying #pragma weak %q+D after first use results in unspecified behavior, decl); @@ -280,7 +281,7 @@ maybe_apply_pragma_weak (tree decl) /* Avoid asking for DECL_ASSEMBLER_NAME when it's not needed. */ /* No weak symbols pending, take the short-cut. */ - if (!pending_weaks) + if (vec_safe_is_empty (pending_weaks)) return; /* If it's not visible outside this file, it doesn't matter whether it's weak. */ @@ -292,7 +293,13 @@ maybe_apply_pragma_weak (tree decl) if (TREE_CODE (decl) != FUNCTION_DECL TREE_CODE (decl) != VAR_DECL) return; - id = DECL_ASSEMBLER_NAME (decl); + if (DECL_ASSEMBLER_NAME_SET_P (decl)) +id = DECL_ASSEMBLER_NAME (decl); + else +{ + id = DECL_ASSEMBLER_NAME (decl); + SET_DECL_ASSEMBLER_NAME (decl, NULL_TREE); +} FOR_EACH_VEC_ELT (*pending_weaks, i, pe) if (id == pe-name) @@ -313,7 +320,7 @@ maybe_apply_pending_pragma_weaks (void) pending_weak *pe; symtab_node *target; - if (!pending_weaks) + if (vec_safe_is_empty (pending_weaks)) return; FOR_EACH_VEC_ELT (*pending_weaks, i, pe) --- gcc/testsuite/c-c++-common/pr36282-1.c.jj 2014-03-13 14:31:03.752580696 +0100 +++ gcc/testsuite/c-c++-common/pr36282-1.c 2014-03-13 14:31:56.110270219 +0100 @@ -0,0 +1,12 @@ +/* PR middle-end/36282 */ +/* { dg-do compile } */ + +#pragma weak bar + +extern void *baz (void *dest, const void *src, __SIZE_TYPE__ n); +extern __typeof (baz) baz __asm(bazfn); /* { dg-bogus asm declaration ignored due to conflict with previous rename } */ + +void +foo (void) +{ +} --- gcc/testsuite/c-c++-common/pr36282-2.c.jj 2014-03-13 14:31:03.752580696 +0100 +++ 
gcc/testsuite/c-c++-common/pr36282-2.c 2014-03-13 14:32:01.264247933 +0100 @@ -0,0 +1,10 @@ +/* PR middle-end/36282 */ +/* { dg-do compile } */ + +extern void *baz (void *dest, const void *src, __SIZE_TYPE__ n); +extern __typeof (baz) baz __asm(bazfn); /* { dg-bogus asm declaration ignored due to conflict with previous rename } */ + +void +foo (void) +{ +} --- gcc/testsuite/c-c++-common/pr36282-3.c.jj 2014-03-13 14:31:03.752580696 +0100 +++ gcc/testsuite/c-c++-common/pr36282-3.c 2014-03-13 14:32:07.243209260 +0100 @@ -0,0 +1,13 @@ +/* PR middle-end/36282 */ +/* { dg-do compile } */ + +void bar (void); +#pragma weak bar + +extern void *baz (void *dest, const void *src, __SIZE_TYPE__ n); +extern __typeof (baz) baz __asm(bazfn); /* { dg-bogus asm declaration ignored due to conflict with previous rename } */ + +void +foo (void) +{ +} ---
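The early-out part of the fix can be illustrated outside of GCC. What follows is only a minimal sketch, not the c-pragma.c code: pending_list, add_pending, take_pending and the two *_early_out helpers are hypothetical names standing in for the pending_weaks vector and its users. It merely shows why a NULL test stops being a useful shortcut once the container has been allocated, while an emptiness test keeps working.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pending_weaks vector; not GCC's vec.  */
struct pending_list
{
  size_t len;
  const char *names[16];
};

/* NULL until the first "#pragma weak" is recorded, never freed after.  */
static struct pending_list *pending;

/* Record a pending weak name, allocating the list on first use.  */
static void
add_pending (const char *name)
{
  if (!pending)
    pending = calloc (1, sizeof *pending);
  assert (pending && pending->len < 16);
  pending->names[pending->len++] = name;
}

/* Remove NAME if present.  The list itself stays allocated.  */
static bool
take_pending (const char *name)
{
  if (!pending)
    return false;
  for (size_t i = 0; i < pending->len; i++)
    if (strcmp (pending->names[i], name) == 0)
      {
        /* Order does not matter here, so move the last entry down.  */
        pending->names[i] = pending->names[--pending->len];
        return true;
      }
  return false;
}

/* Old-style check: true only before the first allocation.  */
static bool
old_early_out (void)
{
  return pending == NULL;
}

/* New-style check: also true once every entry has been consumed.  */
static bool
new_early_out (void)
{
  return pending == NULL || pending->len == 0;
}
```

After one add_pending ("bar") followed by take_pending ("bar"), the old check no longer takes the shortcut although nothing is pending, whereas the new one does.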
Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call
On Thu, Mar 13, 2014 at 10:55 AM, Wei Mi w...@google.com wrote: Can we combine the last two patches, both adding call explicitly in rtl template for tls_local_dynamic_base_32/tls_global_dynamic_32, and set ix86_tls_descriptor_calls_expanded_in_cfun to true only after reload complete? Hi H.J. I attached the patch which combined your two patches and the fix in legitimize_tls_address. I tried pr58066.c and c.i in ia32/x32/x86_64, the code looked fine. Do you think it is ok? Thanks, Wei. Either pr58066-3.patch or pr58066-4.patch looks good to me. Thanks. -- H.J.
Re: [patch] Cleanup the CFG after pro_and_epilogue pass (PR rtl-optimization/57320)
On Fri, May 17, 2013 at 03:59:07PM -0600, Jeff Law wrote:
On 05/17/2013 03:53 PM, Steven Bosscher wrote:
On Fri, May 17, 2013 at 11:16 PM, Jeff Law wrote:

What's happened, is that emitting the epilogue at the end of basic block 4 (with a barrier at the end) has made the use insn 43 unreachable.

But from the description you've given, it appears that the epilogue itself has unreachable code, and that shouldn't be happening. If you think it can happen by way of shrink-wrapping, I'd like to see the analysis.

It is not the epilogue itself but the way shrink-wrapping emits it. The block that is unreachable has its last predecessor edge removed in function.c:6607:

6607 redirect_edge_and_branch_force (e, *pdest_bb);

I haven't looked at how the shrink-wrapping code works exactly. It's Bernd's code, so perhaps he can have a look. This is now PR57320.

OK. Let's go with your patch then. Approved with a comment that shrink-wrapping can result in unreachable edges in the epilogue.

I have bootstrapped/regtested Steven's patch now and committed to trunk:

2014-03-13  Steven Bosscher  <ste...@gcc.gnu.org>

        PR rtl-optimization/57320
        * function.c (rest_of_handle_thread_prologue_and_epilogue): Cleanup
        the CFG after thread_prologue_and_epilogue_insns.

--- gcc/function.c.jj 2014-03-03 08:25:17.0 +0100
+++ gcc/function.c 2014-03-13 15:42:30.534922406 +0100
@@ -6991,6 +6991,10 @@ rest_of_handle_thread_prologue_and_epilo
      scheduling to operate in the epilogue.  */
   thread_prologue_and_epilogue_insns ();
 
+  /* Shrink-wrapping can result in unreachable edges in the epilogue,
+     see PR57320.  */
+  cleanup_cfg (0);
+
   /* The stack usage info is finalized during prologue expansion.  */
   if (flag_stack_usage_info)
     output_stack_usage ();

Jakub
Re: [Patch][google/main] Fix arm build broken
Thanks Richard, I'll remove UNSPEC_SIN/COS from my patch. Han On Thu, Mar 13, 2014 at 3:07 AM, Richard Earnshaw rearn...@arm.com wrote: On 12/03/14 22:35, Hán Shěn (沈涵) wrote: ARM build (on chrome) is broken because of duplicate entries in arm.md and unspecs.md. Fixed by removing duplication and merge those in arm.md into unspecs.md. (We had a similar fix for google/gcc-4_8 here - http://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=198650) Tested by building arm cross compiler successfully. Ok for google/main? Sounds to me like a merge botch. UNSPEC_SIN and UNSPEC_COS were removed from trunk some time back, when the old FPA code was removed. I very much doubt that you need to be re-adding them. R. Patch below - diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 8b269a4..9aec213 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -75,27 +75,6 @@ ] ) -;; UNSPEC Usage: -;; Note: sin and cos are no-longer used. -;; Unspec enumerators for Neon are defined in neon.md. - -(define_c_enum unspec [ - UNSPEC_SIN; `sin' operation (MODE_FLOAT): -; operand 0 is the result, -; operand 1 the parameter. - UNPSEC_COS; `cos' operation (MODE_FLOAT): -; operand 0 is the result, -; operand 1 the parameter. - UNSPEC_PROLOGUE_USE ; As USE insns are not meaningful after reload, -; this unspec is used to prevent the deletion of -; instructions setting registers for EH handling -; and stack frame generation. Operand 0 is the -; register to use. - UNSPEC_WMADDS ; Used by the intrinsic form of the iWMMXt WMADDS instruction. - UNSPEC_WMADDU ; Used by the intrinsic form of the iWMMXt WMADDU instruction. - UNSPEC_GOT_PREL_SYM ; Specify an R_ARM_GOT_PREL relocation of a symbol. 
-]) - ;; UNSPEC_VOLATILE Usage: diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index 8caa953..89bc528 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -24,6 +24,12 @@ ;; Unspec enumerators for iwmmxt2 are defined in iwmmxt2.md (define_c_enum unspec [ + UNSPEC_SIN; `sin' operation (MODE_FLOAT): +; operand 0 is the result, +; operand 1 the parameter. + UNPSEC_COS; `cos' operation (MODE_FLOAT): +; operand 0 is the result, +; operand 1 the parameter. UNSPEC_PUSH_MULT ; `push multiple' operation: ; operand 0 is the first register, ; subsequent registers are in parallel (use ...) @@ -58,6 +64,7 @@ ; instruction stream. UNSPEC_PIC_OFFSET ; A symbolic 12-bit OFFSET that has been treated ; correctly for PIC usage. + UNSPEC_GOT_PREL_SYM ; Specify an R_ARM_GOT_PREL relocation of a symbol. UNSPEC_GOTSYM_OFF ; The offset of the start of the GOT from a ; a given symbolic address. UNSPEC_THUMB1_CASESI ; A Thumb1 compressed dispatch-table call. @@ -70,6 +77,11 @@ ; that. UNSPEC_UNALIGNED_STORE ; Same for str/strh. UNSPEC_PIC_UNIFIED; Create a common pic addressing form. + UNSPEC_PROLOGUE_USE ; As USE insns are not meaningful after reload, +; this unspec is used to prevent the deletion of +; instructions setting registers for EH handling +; and stack frame generation. Operand 0 is the +; register to use. UNSPEC_LL ; Represent an unpaired load-register-exclusive. UNSPEC_VRINTZ ; Represent a float to integral float rounding ; towards zero. @@ -87,6 +99,8 @@ (define_c_enum unspec [ UNSPEC_WADDC ; Used by the intrinsic form of the iWMMXt WADDC instruction. + UNSPEC_WMADDS ; Used by the intrinsic form of the iWMMXt WMADDS instruction. + UNSPEC_WMADDU ; Used by the intrinsic form of the iWMMXt WMADDU instruction. UNSPEC_WABS ; Used by the intrinsic form of the iWMMXt WABS instruction. UNSPEC_WQMULWMR ; Used by the intrinsic form of the iWMMXt WQMULWMR instruction. UNSPEC_WQMULMR ; Used by the intrinsic form of the iWMMXt WQMULMR instruction. 
Han -- Han Shen | Software Engineer | shen...@google.com | +1-650-440-3330
Re: [PATCH] Don't ICE with huge alignment (PR middle-end/60226)
Ping.

On Tue, Mar 04, 2014 at 05:40:29PM +0100, Marek Polacek wrote:

This should fix ICE on insane alignment. Normally, check_user_alignment detects e.g. alignment 1 << 32, but not 1 << 28. However, record_align is in bits, so it's actually 8 * (1 << 28) and that's greater than INT_MAX. This patch rejects such code.

In the middle hunk, we should give up when an error occurs, we don't want to call finalize_type_size in that case -- we'd ICE in there.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2014-03-04  Marek Polacek  <pola...@redhat.com>

        PR middle-end/60226
        * stor-layout.c (layout_type): Return if alignment of array elements
        is greater than element size.  Error out if requested alignment is too
        large.
cp/
        * class.c (layout_class_type): Error out if requested alignment is too
        large.
testsuite/
        * c-c++-common/pr60226.c: New test.

diff --git gcc/cp/class.c gcc/cp/class.c
index b46391b..e6325b3 100644
--- gcc/cp/class.c
+++ gcc/cp/class.c
@@ -6378,6 +6378,14 @@ layout_class_type (tree t, tree *virtuals_p)
   if (TYPE_PACKED (t) && !layout_pod_type_p (t))
     rli->packed_maybe_necessary = true;
 
+  if (rli->record_align >= (1U << (HOST_BITS_PER_INT - 1)))
+    {
+      TYPE_SIZE (rli->t) = integer_zero_node;
+      TYPE_SIZE_UNIT (rli->t) = integer_zero_node;
+      error ("requested alignment is too large");
+      return;
+    }
+
   /* Let the back end lay out the type.  */
   finish_record_layout (rli, /*free_p=*/true);
diff --git gcc/stor-layout.c gcc/stor-layout.c
index 084d195..445f0d5 100644
--- gcc/stor-layout.c
+++ gcc/stor-layout.c
@@ -2266,8 +2266,11 @@ layout_type (tree type)
             && !TREE_OVERFLOW (TYPE_SIZE_UNIT (element))
             && !integer_zerop (TYPE_SIZE_UNIT (element))
             && compare_tree_int (TYPE_SIZE_UNIT (element),
-                                 TYPE_ALIGN_UNIT (element)) < 0)
-          error ("alignment of array elements is greater than element size");
+                                 TYPE_ALIGN_UNIT (element)) < 0)
+          {
+            error ("alignment of array elements is greater than element size");
+            return;
+          }
         break;
       }
 
@@ -2294,6 +2297,14 @@ layout_type (tree type)
         if (TREE_CODE (type) == QUAL_UNION_TYPE)
           TYPE_FIELDS (type) = nreverse (TYPE_FIELDS (type));
 
+        if (rli->record_align >= (1U << (HOST_BITS_PER_INT - 1)))
+          {
+            TYPE_SIZE (rli->t) = integer_zero_node;
+            TYPE_SIZE_UNIT (rli->t) = integer_zero_node;
+            error ("requested alignment is too large");
+            return;
+          }
+
         /* Finish laying out the record.  */
         finish_record_layout (rli, /*free_p=*/true);
       }
diff --git gcc/testsuite/c-c++-common/pr60226.c gcc/testsuite/c-c++-common/pr60226.c
index e69de29..0d7d74d 100644
--- gcc/testsuite/c-c++-common/pr60226.c
+++ gcc/testsuite/c-c++-common/pr60226.c
@@ -0,0 +1,12 @@
+/* PR c/60226 */
+/* { dg-do compile } */
+/* { dg-options "-Wno-c++-compat" { target c } } */
+
+typedef int __attribute__ ((aligned (1 << 28))) int28;
+int28 foo[4] = {}; /* { dg-error "alignment of array elements is greater than element size" } */
+
+void
+f (void)
+{
+  struct { __attribute__((aligned (1 << 28))) double a; } x; /* { dg-error "requested alignment is too large" } */
+}

Marek
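The arithmetic behind the rejected alignment can be checked in isolation. This is a rough sketch rather than the stor-layout.c logic: alignment_in_bits_overflows_int and SKETCH_BITS_PER_UNIT are hypothetical names, and the numbers in the usage note assume a 32-bit int and 8 bits per unit.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption for this sketch: 8 bits per addressable unit.  */
#define SKETCH_BITS_PER_UNIT 8

/* Return true if an alignment given in bytes, once converted to bits,
   would no longer fit in a signed int.  */
static bool
alignment_in_bits_overflows_int (unsigned long long align_bytes)
{
  /* First value that does not fit in a signed int: INT_MAX + 1.  */
  unsigned long long limit_bits = (unsigned long long) INT_MAX + 1;

  /* Compare in bytes so that the multiplication cannot wrap.  */
  return align_bytes >= limit_bits / SKETCH_BITS_PER_UNIT;
}
```

Under those assumptions an alignment of 1 << 27 bytes is still representable in bits, while 1 << 28 bytes gives 8 * (1 << 28) = 1 << 31 bits, one past INT_MAX, which is the case the message describes.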
Re: [PATCH] Fix overflows in get_ref_base_and_extent (PR tree-optimization/59779)
Hi Jakub,

Is this change different from the one attached to PR? I have a bootstrap/regtest going with it.

Dave

On 3/13/2014 1:50 PM, Jakub Jelinek wrote:

Hi!

The outer-1.c testcase apparently fails on 32-bit HWI targets, the problem is that the int x[1][1]; array has bitsize that fits into uhwi, but not shwi, so we get negative maxsize that isn't -1. After discussions with Richard on IRC, I've implemented computation of bitsize and maxsize in double_int.

Bootstrapped/regtested on x86_64-linux and i686-linux, with bootstrap/regtest time changes in the noise. Ok for trunk?

John, could you please test this on some 32-bit HWI target with bootstrap/regtest?

2014-03-13  Jakub Jelinek  <ja...@redhat.com>

        PR tree-optimization/59779
        * tree-dfa.c (get_ref_base_and_extent): Use double_int
        type for bitsize and maxsize instead of HOST_WIDE_INT.

--- gcc/tree-dfa.c.jj 2014-01-03 11:40:57.0 +0100
+++ gcc/tree-dfa.c 2014-03-13 12:10:46.367886640 +0100
@@ -389,11 +389,10 @@ get_ref_base_and_extent (tree exp, HOST_
                          HOST_WIDE_INT *psize,
                          HOST_WIDE_INT *pmax_size)
 {
-  HOST_WIDE_INT bitsize = -1;
-  HOST_WIDE_INT maxsize = -1;
+  double_int bitsize = double_int_minus_one;
+  double_int maxsize;
   tree size_tree = NULL_TREE;
   double_int bit_offset = double_int_zero;
-  HOST_WIDE_INT hbit_offset;
   bool seen_variable_array_ref = false;
 
   /* First get the final access size from just the outermost expression.  */
@@ -407,15 +406,11 @@ get_ref_base_and_extent (tree exp, HOST_
       if (mode == BLKmode)
         size_tree = TYPE_SIZE (TREE_TYPE (exp));
       else
-        bitsize = GET_MODE_BITSIZE (mode);
-    }
-  if (size_tree != NULL_TREE)
-    {
-      if (! tree_fits_uhwi_p (size_tree))
-        bitsize = -1;
-      else
-        bitsize = tree_to_uhwi (size_tree);
+        bitsize = double_int::from_uhwi (GET_MODE_BITSIZE (mode));
     }
+  if (size_tree != NULL_TREE
+      && TREE_CODE (size_tree) == INTEGER_CST)
+    bitsize = tree_to_double_int (size_tree);
 
   /* Initially, maxsize is the same as the accessed element size.
      In the following it will only grow (or become -1).  */
@@ -448,7 +443,7 @@ get_ref_base_and_extent (tree exp, HOST_
                    referenced the last field of a struct or a union member
                    then we have to adjust maxsize by the padding at the end
                    of our field.  */
-                if (seen_variable_array_ref && maxsize != -1)
+                if (seen_variable_array_ref && !maxsize.is_minus_one ())
                   {
                     tree stype = TREE_TYPE (TREE_OPERAND (exp, 0));
                     tree next = DECL_CHAIN (field);
@@ -459,15 +454,22 @@ get_ref_base_and_extent (tree exp, HOST_
                       {
                         tree fsize = DECL_SIZE_UNIT (field);
                         tree ssize = TYPE_SIZE_UNIT (stype);
-                        if (tree_fits_shwi_p (fsize)
-                            && tree_fits_shwi_p (ssize)
-                            && doffset.fits_shwi ())
-                          maxsize += ((tree_to_shwi (ssize)
-                                       - tree_to_shwi (fsize))
-                                      * BITS_PER_UNIT
-                                      - doffset.to_shwi ());
+                        if (fsize == NULL
+                            || TREE_CODE (fsize) != INTEGER_CST
+                            || ssize == NULL
+                            || TREE_CODE (ssize) != INTEGER_CST)
+                          maxsize = double_int_minus_one;
                         else
-                          maxsize = -1;
+                          {
+                            double_int tem = tree_to_double_int (ssize)
+                                             - tree_to_double_int (fsize);
+                            if (BITS_PER_UNIT == 8)
+                              tem = tem.lshift (3);
+                            else
+                              tem *= double_int::from_uhwi (BITS_PER_UNIT);
+                            tem -= doffset;
+                            maxsize += tem;
+                          }
                       }
                   }
               }
@@ -477,13 +479,12 @@ get_ref_base_and_extent (tree exp, HOST_
             /* We need to adjust maxsize to the whole structure bitsize.
                But we can subtract any constant offset seen so far,
                because that would get us out of the structure otherwise.  */
-            if (maxsize != -1
+            if (!maxsize.is_minus_one ()
                 && csize
-                && tree_fits_uhwi_p (csize)
-                && bit_offset.fits_shwi ())
-              maxsize = tree_to_uhwi (csize) - bit_offset.to_shwi ();
+                && TREE_CODE (csize) == INTEGER_CST)
+              maxsize = tree_to_double_int (csize) - bit_offset;
             else
-
Re: [PATCH] Fix overflows in get_ref_base_and_extent (PR tree-optimization/59779)
On Thu, Mar 13, 2014 at 02:13:39PM -0400, John David Anglin wrote:

Is this change different from the one attached to PR? I have a bootstrap/regtest going with it.

Yes, the one attached to the PR had major bugs: in two spots it replaced
  if (maxsize == -1
with
  if (!maxsize.is_minus_one ()
rather than
  if (maxsize.is_minus_one ()
It passed bootstrap/regtest on i686-linux, but contains tons of testsuite regressions. Sorry for that.

Jakub
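The "fits into uhwi, but not shwi" failure mode discussed in this thread can be reproduced with fixed-width integers. The sketch below is an illustration under assumptions, not the tree-dfa.c code: it models a 32-bit HOST_WIDE_INT with int32_t/uint32_t, uses int64_t as a stand-in for double_int, and all function names (fits_uhwi32, fits_shwi32, narrow_size, wide_size) as well as the sample size are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Model a 32-bit HOST_WIDE_INT with the fixed-width types.  */
static bool
fits_uhwi32 (uint64_t v)
{
  return v <= UINT32_MAX;
}

static bool
fits_shwi32 (uint64_t v)
{
  return v <= INT32_MAX;
}

/* Narrow bookkeeping: the size is accepted when it fits the unsigned
   type but is then kept in the signed one, where -1 means "unknown".  */
static int32_t
narrow_size (uint64_t bits)
{
  if (!fits_uhwi32 (bits))
    return -1;
  uint32_t u = (uint32_t) bits;
  int32_t s;
  /* Reinterpret the bit pattern; int32_t is two's complement.  */
  memcpy (&s, &u, sizeof s);
  return s;
}

/* Wider bookkeeping, with int64_t standing in for double_int.  */
static int64_t
wide_size (uint64_t bits)
{
  return bits <= INT64_MAX ? (int64_t) bits : -1;
}
```

For a size of 3000000000 bits (an arbitrary value between INT32_MAX and UINT32_MAX), narrow_size returns a negative number that is not the -1 sentinel, so a later "is it -1?" test treats it as a known size; wide_size keeps the value intact.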
Re: [PATCH, libiberty]: Avoid 'right-hand operand of comma expression has no effect' when compiling regex.c
On Thu, Mar 13, 2014 at 6:30 PM, Ian Lance Taylor i...@google.com wrote:
On Thu, Mar 13, 2014 at 3:36 AM, Uros Bizjak ubiz...@gmail.com wrote:

Attached patch changes the return value of the bzero macro to void, as defined in 4.3BSD:

void bzero(void *s, size_t n);

As an additional benefit, the changed macro now generates a warning when its return value is used (which is *not* the case in regex.c):

I'm not worried about anybody using the return value incorrectly in this file. I think we should just

# define bzero(s, n) memset (s, '\0', n)

I'll approve that change if it works.

Attached patch compiles without warnings as well. However, in some cases we have a similar situation with the unused return value of

# define memcpy(d, s, n)(bcopy (s, d, n), (d))

so I put (void) casts on the memcpy calls to avoid possible "right-hand operand of comma expression has no effect" warnings there.

2014-03-13  Uros Bizjak  <ubiz...@gmail.com>

        * regex.c (bzero) [!_LIBC]: Define without comma expression.
        (regerror): Cast the call to memcpy to (void) to avoid unused
        value warnings.

Is this version acceptable for mainline?

Uros.

Index: ChangeLog
===================================================================
--- ChangeLog (revision 208550)
+++ ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2014-03-13  Uros Bizjak  <ubiz...@gmail.com>
+
+        * regex.c (bzero) [!_LIBC]: Define without comma expression.
+        (regerror): Cast the call to memcpy to (void) to avoid unused
+        value warnings.
+
 2014-01-28  Thomas Schwinge  <tho...@codesourcery.com>
 
         * cp-demangle.c (d_demangle_callback): Put an abort call in place,
Index: regex.c
===================================================================
--- regex.c (revision 208550)
+++ regex.c (working copy)
@@ -151,7 +151,7 @@ char *realloc ();
 #include <string.h>
 #ifndef bzero
 # ifndef _LIBC
-#  define bzero(s, n) (memset (s, '\0', n), (s))
+#  define bzero(s, n) memset (s, '\0', n)
 # else
 #  define bzero(s, n) __bzero (s, n)
 # endif
@@ -8093,12 +8093,12 @@ regerror (int errcode, const regex_t *preg ATTRIBU
 #if defined HAVE_MEMPCPY || defined _LIBC
           *((char *) mempcpy (errbuf, msg, errbuf_size - 1)) = '\0';
 #else
-          memcpy (errbuf, msg, errbuf_size - 1);
+          (void) memcpy (errbuf, msg, errbuf_size - 1);
           errbuf[errbuf_size - 1] = 0;
 #endif
         }
       else
-        memcpy (errbuf, msg, msg_size);
+        (void) memcpy (errbuf, msg, msg_size);
     }
 
   return msg_size;
Re: [PATCH, libiberty]: Avoid 'right-hand operand of comma expression has no effect' when compiling regex.c
On Thu, Mar 13, 2014 at 11:24 AM, Uros Bizjak ubiz...@gmail.com wrote:
On Thu, Mar 13, 2014 at 6:30 PM, Ian Lance Taylor i...@google.com wrote:
On Thu, Mar 13, 2014 at 3:36 AM, Uros Bizjak ubiz...@gmail.com wrote:

Attached patch changes the return value of the bzero macro to void, as defined in 4.3BSD:

void bzero(void *s, size_t n);

As an additional benefit, the changed macro now generates a warning when its return value is used (which is *not* the case in regex.c):

I'm not worried about anybody using the return value incorrectly in this file. I think we should just

# define bzero(s, n) memset (s, '\0', n)

I'll approve that change if it works.

Attached patch compiles without warnings as well. However, in some cases we have a similar situation with the unused return value of

# define memcpy(d, s, n)(bcopy (s, d, n), (d))

so I put (void) casts on the memcpy calls to avoid possible "right-hand operand of comma expression has no effect" warnings there.

2014-03-13  Uros Bizjak  <ubiz...@gmail.com>

        * regex.c (bzero) [!_LIBC]: Define without comma expression.
        (regerror): Cast the call to memcpy to (void) to avoid unused
        value warnings.

Is this version acceptable for mainline?

This is OK. Thanks.

Ian
Re: [patch,avr] Fix PR59396: Ignore leading '*' in warning generation for ISR names
2014-03-13 21:41 GMT+04:00 Georg-Johann Lay a...@gjlay.de:
On 03/13/2014 04:41 PM, Senthil Kumar Selvaraj wrote:
On Thu, Mar 13, 2014 at 02:24:06PM +0100, Georg-Johann Lay wrote:

Problem is that the assembler name might or might not be prefixed by '*' depending on when TARGET_SET_CURRENT_FUNCTION is called. The change is just to fix wrong warning because the current implementation of TARGET_SET_CURRENT_FUNCTION /always/ skips the first char when the assembler name is set.

FWIW, there's default_strip_name_encoding (varasm.c), which does the same thing, and is used by a couple of other targets.

Yes, I know. But I would prefer targetm.strip_name_encoding then, even though avr does not implement it.

I prefer `targetm.strip_name_encoding' or `default_strip_name_encoding'. Maybe `default_strip_name_encoding' is better because it's used in a few ports.

Denis.
Re: [PATCH] Fix overflows in get_ref_base_and_extent (PR tree-optimization/59779)
On March 13, 2014 6:50:53 PM CET, Jakub Jelinek ja...@redhat.com wrote: Hi! The outer-1.c testcase apparently fails on 32-bit HWI targets, the problem is that the int x[1][1]; array has bitsize that fits into uhwi, but not shwi, so we get negative maxsize that isn't -1. After discussions with Richard on IRC, I've implemented computation of bitsize and maxsize in double_int. Bootstrapped/regtested on x86_64-linux and i686-linux, with bootstrap/regtest time changes in the noise. Ok for trunk? OK. Thanks, Richard. John, could you please test this on some 32-bit HWI target with bootstrap/regtest? 2014-03-13 Jakub Jelinek ja...@redhat.com PR tree-optimization/59779 * tree-dfa.c (get_ref_base_and_extent): Use double_int type for bitsize and maxsize instead of HOST_WIDE_INT. --- gcc/tree-dfa.c.jj 2014-01-03 11:40:57.0 +0100 +++ gcc/tree-dfa.c 2014-03-13 12:10:46.367886640 +0100 @@ -389,11 +389,10 @@ get_ref_base_and_extent (tree exp, HOST_ HOST_WIDE_INT *psize, HOST_WIDE_INT *pmax_size) { - HOST_WIDE_INT bitsize = -1; - HOST_WIDE_INT maxsize = -1; + double_int bitsize = double_int_minus_one; + double_int maxsize; tree size_tree = NULL_TREE; double_int bit_offset = double_int_zero; - HOST_WIDE_INT hbit_offset; bool seen_variable_array_ref = false; /* First get the final access size from just the outermost expression. */ @@ -407,15 +406,11 @@ get_ref_base_and_extent (tree exp, HOST_ if (mode == BLKmode) size_tree = TYPE_SIZE (TREE_TYPE (exp)); else - bitsize = GET_MODE_BITSIZE (mode); -} - if (size_tree != NULL_TREE) -{ - if (! tree_fits_uhwi_p (size_tree)) - bitsize = -1; - else - bitsize = tree_to_uhwi (size_tree); + bitsize = double_int::from_uhwi (GET_MODE_BITSIZE (mode)); } + if (size_tree != NULL_TREE + TREE_CODE (size_tree) == INTEGER_CST) +bitsize = tree_to_double_int (size_tree); /* Initially, maxsize is the same as the accessed element size. In the following it will only grow (or become -1). 
*/ @@ -448,7 +443,7 @@ get_ref_base_and_extent (tree exp, HOST_ referenced the last field of a struct or a union member then we have to adjust maxsize by the padding at the end of our field. */ - if (seen_variable_array_ref maxsize != -1) + if (seen_variable_array_ref !maxsize.is_minus_one ()) { tree stype = TREE_TYPE (TREE_OPERAND (exp, 0)); tree next = DECL_CHAIN (field); @@ -459,15 +454,22 @@ get_ref_base_and_extent (tree exp, HOST_ { tree fsize = DECL_SIZE_UNIT (field); tree ssize = TYPE_SIZE_UNIT (stype); - if (tree_fits_shwi_p (fsize) - tree_fits_shwi_p (ssize) - doffset.fits_shwi ()) -maxsize += ((tree_to_shwi (ssize) - - tree_to_shwi (fsize)) -* BITS_PER_UNIT - - doffset.to_shwi ()); + if (fsize == NULL + || TREE_CODE (fsize) != INTEGER_CST + || ssize == NULL + || TREE_CODE (ssize) != INTEGER_CST) +maxsize = double_int_minus_one; else -maxsize = -1; +{ + double_int tem = tree_to_double_int (ssize) + - tree_to_double_int (fsize); + if (BITS_PER_UNIT == 8) +tem = tem.lshift (3); + else +tem *= double_int::from_uhwi (BITS_PER_UNIT); + tem -= doffset; + maxsize += tem; +} } } } @@ -477,13 +479,12 @@ get_ref_base_and_extent (tree exp, HOST_ /* We need to adjust maxsize to the whole structure bitsize. But we can subtract any constant offset seen so far, because that would get us out of the structure otherwise. */ - if (maxsize != -1 + if (!maxsize.is_minus_one () csize - tree_fits_uhwi_p (csize) - bit_offset.fits_shwi ()) -maxsize = tree_to_uhwi (csize) - bit_offset.to_shwi (); + TREE_CODE (csize) == INTEGER_CST) +maxsize = tree_to_double_int (csize) - bit_offset; else -maxsize = -1; +maxsize = double_int_minus_one; } } break; @@ -520,13 +521,12 @@ get_ref_base_and_extent
Re: [patch,avr] Fix PR59396, take 2: Ignore leading '*' in warning generation for ISR names
On 03/13/2014 07:36 PM, Denis Chertykov wrote:
2014-03-13 21:41 GMT+04:00 Georg-Johann Lay:
On 03/13/2014 04:41 PM, Senthil Kumar Selvaraj wrote:
On Thu, Mar 13, 2014 at 02:24:06PM +0100, Georg-Johann Lay wrote:

Problem is that the assembler name might or might not be prefixed by '*' depending on when TARGET_SET_CURRENT_FUNCTION is called. The change is just to fix wrong warning because the current implementation of TARGET_SET_CURRENT_FUNCTION /always/ skips the first char when the assembler name is set.

FWIW, there's default_strip_name_encoding (varasm.c), which does the same thing, and is used by a couple of other targets.

Yes, I know. But I would prefer targetm.strip_name_encoding then, even though avr does not implement it.

I prefer `targetm.strip_name_encoding' or `default_strip_name_encoding'. Maybe `default_strip_name_encoding' is better because it's used in a few ports.

Denis.

So here is the revised version of the patch.

Johann

        PR target/59396
        * config/avr/avr.c (avr_set_current_function): Pass function name
        through default_strip_name_encoding before sanity checking instead
        of skipping the first char of the assembler name.

Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c (revision 208532)
+++ config/avr/avr.c (working copy)
@@ -600,10 +600,14 @@ avr_set_current_function (tree decl)
       const char *name;
 
       name = DECL_ASSEMBLER_NAME_SET_P (decl)
-        /* Remove the leading '*' added in set_user_assembler_name.  */
-        ? 1 + IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl))
+        ? IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl))
         : IDENTIFIER_POINTER (DECL_NAME (decl));
 
+      /* Skip a leading '*' that might still prefix the assembler name,
+         e.g. in non-LTO runs.  */
+
+      name = default_strip_name_encoding (name);
+
       /* Silently ignore 'signal' if 'interrupt' is present.  AVR-LibC startet
          using this when it switched from SIGNAL and INTERRUPT to ISR.  */
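The difference between always skipping the first character and stripping a '*' only when it is present can be shown with plain strings. This is a simplified sketch, not the avr.c or varasm.c code: skip_first_char, strip_leading_star and looks_like_vector are hypothetical helpers, and the "__vector" prefix used for the sanity check is an assumption made for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Old approach: unconditionally drop the first character.  */
static const char *
skip_first_char (const char *name)
{
  return name + 1;
}

/* New approach, modelled on what the thread attributes to
   default_strip_name_encoding: drop a '*' only if it is there.  */
static const char *
strip_leading_star (const char *name)
{
  return name + (name[0] == '*');
}

/* Hypothetical sanity check: does NAME look like an ISR vector name?  */
static bool
looks_like_vector (const char *name)
{
  return strncmp (name, "__vector", strlen ("__vector")) == 0;
}
```

With "*__vector_1" both approaches yield "__vector_1". With an unprefixed "__vector_1" the old approach yields "_vector_1", which fails the check and would produce the spurious warning described above, while the conditional strip leaves the name unchanged.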
[COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
The original ICE is caused by the dwarf2cfi pass not noticing a stack adjustment in the insn stream. The reason for the miss is that the push/pop was added by a post-reload splitter, which did nothing to mark the insn for special treatment.

In the PR, Jakub and I threw around several ideas. My first attempt, to annotate the stack adjustment so that dwarf2cfi could see it, showed a lack of supporting infrastructure within the csa and dwarf2cfi passes. In order to make that path work in the short term, csa had to be crippled.

My second attempt removes ix86_force_to_memory and all uses thereof. Thus there are no longer any troublesome post-reload stack manipulations and dwarf2cfi doesn't get confused.

In the case of int->float conversions, we can figure out at rtl-expand time that we might need a bit o' stack memory, and we can allocate it via assign_386_stack_local. This produces minimal churn in this area, though we still get to remove some patterns that didn't have the scratch memory.

In the other cases the patterns are created by combine, and there we have no chance to use assign_386_stack_local. Here, I simply remove the register alternatives, leaving the register allocator no choice but to force the value into memory. If I recall correctly, the old reload would barf on this (thus the byzantine structure of the existing patterns). But Vlad assured me that LRA will handle this just fine. Considering that we mostly will never choose the combined int-convert-and-operate patterns (-Os or ancient cpu tunings only), I think this is the best of the available options.

For stage1, it would be interesting to investigate whether we can eliminate the assign_386_stack_local fiddly bits from int->float conversions, and simply rely on LRA there as well. It would certainly reduce some code duplication. Indeed, with clever use of "enabled" perhaps we can reduce to a single pattern.

Tested on x86_64 and i686-linux.
r~ PR debug/60438 * config/i386/i386.c (ix86_split_fp_branch): Remove pushed argument. (ix86_force_to_memory, ix86_free_from_memory): Remove. * config/i386/i386-protos.h: Likewise. * config/i386/i386.md (floathiX87MODEF2): Use assign_386_stack_local in the expander instead of a splitter. (floatSWI48xX87MODEF2): Use assign_386_stack_local if there is any possibility of requiring a memory. (*floatsiMODEF2_vector_mixed): Remove, and the splitters. (*floatsiMODEF2_vector_sse): Remove, and the splitters. (fp branch splitters): Update for ix86_split_fp_branch. (*jccX87MODEF_SWI24_i387): Remove r/f alternative. (*jccX87MODEF_SWI24_r_i387): Likewise. (splitter for jccX87MODEF_SWI24_i387 r/f): Remove. (*fop_MODEF_2_i387): Remove f/r alternative. (*fop_MODEF_3_i387): Likewise. (*fop_xf_2_i387, *fop_xf_3_i387): Likewise. (splitters for the fop_* register patterns): Remove. (fscalexf4_i387): Rename from *fscalexf4_i387. (ldexpxf3): Use gen_floatsixf2 and gen_fscalexf4_i387. diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 3493904..6e32978 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -154,13 +154,11 @@ extern enum machine_mode ix86_fp_compare_mode (enum rtx_code); extern rtx ix86_libcall_value (enum machine_mode); extern bool ix86_function_arg_regno_p (int); extern void ix86_asm_output_function_label (FILE *, const char *, tree); -extern rtx ix86_force_to_memory (enum machine_mode, rtx); -extern void ix86_free_from_memory (enum machine_mode); extern void ix86_call_abi_override (const_tree); extern int ix86_reg_parm_stack_space (const_tree); extern void ix86_split_fp_branch (enum rtx_code code, rtx, rtx, - rtx, rtx, rtx, rtx); + rtx, rtx, rtx); extern bool ix86_hard_regno_mode_ok (int, enum machine_mode); extern bool ix86_modes_tieable_p (enum machine_mode, enum machine_mode); extern bool ix86_secondary_memory_needed (enum reg_class, enum reg_class, diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c 
index 9e33d53..64b8e0a 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -19993,7 +19993,7 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx op1, rtx label) /* Split branch based on floating point condition. */ void ix86_split_fp_branch (enum rtx_code code, rtx op1, rtx op2, - rtx target1, rtx target2, rtx tmp, rtx pushed) + rtx target1, rtx target2, rtx tmp) { rtx condition; rtx i; @@ -20009,10 +20009,6 @@ ix86_split_fp_branch (enum rtx_code code, rtx op1, rtx op2, condition = ix86_expand_fp_compare (code, op1, op2, tmp); - /* Remove pushed operand from stack. */ - if (pushed) -ix86_free_from_memory (GET_MODE (pushed)); - i = emit_jump_insn (gen_rtx_SET
Re: [PATCH] Fix up #pragma weak handling (PR middle-end/36282)
On Thu, 13 Mar 2014, Jakub Jelinek wrote: 2014-03-13 Jakub Jelinek ja...@redhat.com PR middle-end/36282 * c-pragma.c (apply_pragma_weak): Only look at TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl)) if DECL_ASSEMBLER_NAME_SET_P (decl). (maybe_apply_pending_pragma_weaks): Exit early if vec_safe_is_empty (pending_weaks) rather than only when !pending_weaks. (maybe_apply_pragma_weak): Likewise. If !DECL_ASSEMBLER_NAME_SET_P, set assembler name back to NULL afterwards. * c-c++-common/pr36282-1.c: New test. * c-c++-common/pr36282-2.c: New test. * c-c++-common/pr36282-3.c: New test. * c-c++-common/pr36282-4.c: New test. OK. -- Joseph S. Myers jos...@codesourcery.com
Re: patch fortran, pr 59746, internal compiler error : segmentation fault
Hello,

On 10/03/2014 03:15, jimmie.da...@l-3com.com wrote:

Index: gcc/gcc/fortran/symbol.c
===================================================================
--- gcc/gcc/fortran/symbol.c (revision 208437)
+++ gcc/gcc/fortran/symbol.c (working copy)
@@ -3069,56 +3069,56 @@
   FOR_EACH_VEC_ELT (latest_undo_chgset->syms, i, p)
     {
-      if (p->gfc_new)
+      /* Symbol was new. Or was old and just put in common.  */

Now the comment needs updating, as "just put in common" also applies to the new case. Or you can also remove it ("just put in common" is somewhat redundant with the other comment anyway).

+      if ( p->attr.in_common && p->common_block && p->common_block->head
+          && (p->gfc_new || !p->old_symbol->attr.in_common))
         {
-          /* Symbol was new. */
-          if (p->attr.in_common && p->common_block && p->common_block->head)
-            {
-              /* If the symbol was added to any common block, it
-                 needs to be removed to stop the resolver looking
-                 for a (possibly) dead symbol.  */
+          /* If the symbol was added to any common block, it
+          needs to be removed to stop the resolver looking
+          for a (possibly) dead symbol.  */

"needs" should be aligned with "If" like it was before; same for "for".

Now we are in pretty good shape. The ICE happens with invalid code after reporting an error, correct? Then I agree, this should rather wait for stage 1.

Thanks
Mikael
Re: [PATCH, libiberty]: Avoid 'right-hand operand of comma expression has no effect' when compiling regex.c
On 03/13/2014 10:36 AM, Uros Bizjak wrote:

+# define bzero(s, n)(memset (s, '\0', n), (void) 0)

AFAICS, the comma operator was only needed because of the intention to return 's'. If 's' is not to be returned, then simply

# define bzero(s, n) ((void) memset (s, '\0', n))

should work.

-- Pedro Alves
Re: [PATCH, libiberty]: Avoid 'right-hand operand of comma expression has no effect' when compiling regex.c
On Thu, Mar 13, 2014 at 10:24 PM, Pedro Alves pal...@redhat.com wrote:
On 03/13/2014 10:36 AM, Uros Bizjak wrote:

+# define bzero(s, n)(memset (s, '\0', n), (void) 0)

AFAICS, the comma operator was only needed because of the intention to return 's'. If 's' is not to be returned, then simply

# define bzero(s, n) ((void) memset (s, '\0', n))

should work.

I think that adding (void) is the best solution. I'll commit this version as soon as bootstrap ends.

Thanks,
Uros.
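The two macro shapes discussed in this thread can be compared side by side. The following is a small sketch for illustration only: the macros are renamed (zero_returning, zero_void) so they do not clash with a system bzero, and all_zero is a hypothetical helper used to check the result.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Comma form, as in the original macro: the value of the expression
   is S.  Used as a plain statement, that trailing (s) is the part a
   compiler may report as having no effect.  */
#define zero_returning(s, n) (memset ((s), '\0', (n)), (s))

/* Form suggested in the thread: discard memset's result explicitly,
   so the macro has type void and cannot be used as a value.  */
#define zero_void(s, n) ((void) memset ((s), '\0', (n)))

/* Helper for checking that a buffer has been cleared.  */
static bool
all_zero (const char *p, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (p[i] != 0)
      return false;
  return true;
}
```

Both forms clear the buffer. The difference is only in what the expression evaluates to: zero_returning (buf, n) can be assigned to a pointer, whereas trying to use the result of zero_void (buf, n) is a compile-time error.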
[PATCH] Add test for PR c++/53711
This patch adds a test case for PR c++/53711 which seems to have been resolved by r199906.

	PR c++/53711
	* g++.dg/warn/anonymous-namespace-6.C: New test.
---
 gcc/testsuite/g++.dg/warn/anonymous-namespace-6.C | 8 ++++++++
 1 file changed, 8 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/anonymous-namespace-6.C

diff --git a/gcc/testsuite/g++.dg/warn/anonymous-namespace-6.C b/gcc/testsuite/g++.dg/warn/anonymous-namespace-6.C
new file mode 100644
index 000..d238df3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/anonymous-namespace-6.C
@@ -0,0 +1,8 @@
+// PR c++/53711
+// { dg-options "-Wall" }
+
+namespace {
+  void f () // { dg-warning "not used" }
+  {
+  }
+}
--
1.9.0
[PATCH]: Revise gcse.c to handle parallels TRAP_IF and other RTL codes not handled by single_set
This patch fixes PR rtl-optimization/60155. The PA backend has a number of INSN patterns which trap on signed overflow. These are implemented as parallels using the trap_if code. Currently, single_set does not consider a parallel with a trap_if rtx to be a single set. This causes an ICE in gcse.c when an insn with a trap_if is encountered.

The problem is fixed by implementing a gcse specific version of single_set which only looks at whether there is a single non-dead set in an insn pattern. It allows multiple other sets if they are dead.

Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.

OK for trunk?

Dave
--
John David Anglin dave.ang...@bell.net

2014-03-13 John David Anglin dang...@gcc.gnu.org

	PR rtl-optimization/60155
	* gcse.c (record_set_data): New function.
	(single_set_gcse): New function.
	(gcse_emit_move_after): Use single_set_gcse instead of single_set.
	(hoist_code): Likewise.
	(get_pressure_class_and_nregs): Likewise.

Index: gcse.c
===
--- gcse.c (revision 208442)
+++ gcse.c (working copy)
@@ -2502,6 +2502,57 @@
     }
 }
 
+struct set_data
+{
+  rtx insn;
+  const_rtx set;
+  int nsets;
+};
+
+/* Increment number of sets and record set in DATA.  */
+
+static void
+record_set_data (rtx dest, const_rtx set, void *data)
+{
+  struct set_data *s = (struct set_data *)data;
+
+  if (GET_CODE (set) == SET)
+    {
+      /* We allow insns having multiple sets, where all but one are
+         dead as single set insns.  In the common case only a single
+         set is present, so we want to avoid checking for REG_UNUSED
+         notes unless necessary.  */
+      if (s->nsets == 1
+          && find_reg_note (s->insn, REG_UNUSED, SET_DEST (s->set))
+          && !side_effects_p (s->set))
+        s->nsets = 0;
+
+      if (!s->nsets)
+        {
+          /* Record this set.  */
+          s->nsets += 1;
+          s->set = set;
+        }
+      else if (!find_reg_note (s->insn, REG_UNUSED, dest)
+               || side_effects_p (set))
+        s->nsets += 1;
+    }
+}
+
+static const_rtx
+single_set_gcse (rtx insn)
+{
+  struct set_data s;
+
+  s.insn = insn;
+  s.nsets = 0;
+  note_stores (PATTERN (insn), record_set_data, &s);
+
+  /* Considered invariant insns have exactly one set.  */
+  gcc_assert (s.nsets == 1);
+  return s.set;
+}
+
 /* Emit move from SRC to DEST noting the equivalence with expression
    computed in INSN.  */
 
@@ -2509,7 +2560,8 @@
 gcse_emit_move_after (rtx dest, rtx src, rtx insn)
 {
   rtx new_rtx;
-  rtx set = single_set (insn), set2;
+  const_rtx set = single_set_gcse (insn);
+  rtx set2;
   rtx note;
   rtx eqv = NULL_RTX;
 
@@ -3369,13 +3421,12 @@
   FOR_EACH_VEC_ELT (occrs_to_hoist, j, occr)
     {
       rtx insn;
-      rtx set;
+      const_rtx set;
 
       gcc_assert (!occr->deleted_p);
 
       insn = occr->insn;
-      set = single_set (insn);
-      gcc_assert (set);
+      set = single_set_gcse (insn);
 
       /* Create a pseudo-reg to store the result of reaching
          expressions into.  Get the mode for the new pseudo
@@ -3456,10 +3507,8 @@
 {
   rtx reg;
   enum reg_class pressure_class;
-  rtx set = single_set (insn);
+  const_rtx set = single_set_gcse (insn);
 
-  /* Considered invariant insns have only one set.  */
-  gcc_assert (set != NULL_RTX);
   reg = SET_DEST (set);
   if (GET_CODE (reg) == SUBREG)
     reg = SUBREG_REG (reg);
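The idea behind the patch (treat an insn as a single set when all but one of its sets are dead) can be modelled outside of GCC. The sketch below is a simplification under stated assumptions: struct toy_set and toy_single_live_set are hypothetical names, the two flags stand in for a REG_UNUSED note and for side_effects_p, and unlike the patch it returns NULL instead of asserting and does not fall back to a dead set when every set is dead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one SET inside an insn pattern.  */
struct toy_set
{
  int dest_reg;        /* register written by this set */
  bool dest_unused;    /* models a REG_UNUSED note for dest_reg */
  bool side_effects;   /* models side_effects_p () being true */
};

/* Return the only set whose result is still needed, or NULL when the
   pattern has zero or several such sets.  A set counts as dead only
   if its destination is unused and it has no side effects.  */
const struct toy_set *
toy_single_live_set (const struct toy_set *sets, size_t n)
{
  const struct toy_set *found = NULL;
  size_t live = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      bool dead = sets[i].dest_unused && !sets[i].side_effects;
      if (!dead)
        {
          found = &sets[i];
          live++;
        }
    }
  return live == 1 ? found : NULL;
}
```

A pattern with one live set plus any number of dead sets yields the live set; two live sets yield NULL, which corresponds to the situation the patch guards against with gcc_assert.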
Re: [PATCH] Fix PR60505
On Thu, Mar 13, 2014 at 2:27 AM, Richard Biener rguent...@suse.de wrote:
> On Wed, 12 Mar 2014, Cong Hou wrote:
>
>> Thank you for pointing it out. I didn't realize that alias analysis influences this issue.
>>
>> The current problem is that the epilogue may be unnecessary if the loop bound cannot be larger than the number of iterations of the vectorized loop multiplied by VF when the vectorized loop is supposed to be executed. My method is incorrect because I assume the vectorized loop will be executed, which is actually guaranteed by the loop bound check (and also alias checks). So if the alias checks exist, my method is fine as both conditions are met.
>
> But there is still the loop bound check which, if it fails, uses the epilogue loop as fallback, not the scalar versioned loop.

The loop bound check is already performed together with the alias checks (assuming we need alias checks). Actually, I did observe that the loop bound check in the true body of the alias checks may be unnecessary. For example, for the following loop

  for(i=0; i < num ; ++i)
    out[i] = (ovec[i] = in[i]);

GCC now generates the following GIMPLE code after vectorization:

  <bb 3>: // loop bound check (with cost model) and alias checks
  _29 = (unsigned int) num_5(D);
  _28 = _29 > 15;
  _24 = in_9(D) + 16;
  _23 = out_7(D) >= _24;
  _2 = out_7(D) + 16;
  _1 = _2 <= in_9(D);
  _32 = _1 | _23;
  _31 = _28 & _32;
  if (_31 != 0)
    goto <bb 4>;
  else
    goto <bb 12>;

  <bb 4>:
  niters.3_44 = (unsigned int) num_5(D);
  _46 = niters.3_44 + 4294967280;
  _47 = _46 >> 4;
  bnd.4_45 = _47 + 1;
  ratio_mult_vf.5_48 = bnd.4_45 << 4;
  _59 = (unsigned int) num_5(D);
  _60 = _59 + 4294967295;
  if (_60 <= 14)       is this necessary?
    goto <bb 10>;
  else
    goto <bb 5>;

The check _60 <= 14 should be unnecessary because it is implied by the fact _29 > 15 in bb 3.

Considering this fact, and if there are alias checks, we can safely remove the epilogue if the maximum trip count of the loop is less than or equal to the calculated threshold.

Cong

>> If there are no alias checks, I must consider the possibility that the vectorized loop may not be executed at runtime and then the epilogue should not be eliminated. The warning appears on epilogue, and with loop bound checks (and without alias checks) the warning will be gone. So I think the key is alias checks: my method only works if there is no alias checks.
>>
>> How about adding one more condition that checks if alias checks are needed, as the code shown below?
>>
>>   else if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
>>            || (tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
>>                < (unsigned)exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo))
>>                && (!LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)
>>                    || (unsigned HOST_WIDE_INT)max_stmt_executions_int
>>                         (LOOP_VINFO_LOOP (loop_vinfo)) > (unsigned)th)))
>>     LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true;
>>
>> thanks,
>> Cong
>>
>> On Wed, Mar 12, 2014 at 1:24 AM, Jakub Jelinek ja...@redhat.com wrote:
>>> On Tue, Mar 11, 2014 at 04:16:13PM -0700, Cong Hou wrote:
>>>> This patch is fixing PR60505 in which the vectorizer may produce unnecessary epilogues.
>>>>
>>>> Bootstrapped and tested on a x86_64 machine.
>>>>
>>>> OK for trunk?
>>>
>>> That looks wrong. Consider the case where the loop isn't versioned: if you disable generation of the epilogue loop, you end up only with a vector loop.
>>>
>>> Say:
>>>
>>> unsigned char ovec[16] __attribute__((aligned (16))) = { 0 };
>>> void
>>> foo (char *__restrict in, char *__restrict out, int num)
>>> {
>>>   int i;
>>>
>>>   in = __builtin_assume_aligned (in, 16);
>>>   out = __builtin_assume_aligned (out, 16);
>>>   for (i = 0; i < num; ++i)
>>>     out[i] = (ovec[i] = in[i]);
>>>   out[num] = ovec[num / 2];
>>> }
>>>
>>> -O2 -ftree-vectorize. Now, consider if this function is called with num != 16 (num > 16 is of course invalid, but num 0 to 15 is valid and your patch will cause a wrong-code in this case).
>>>
>>> Jakub
>
> --
> Richard Biener rguent...@suse.de
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
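To make the role of the epilogue concrete, here is a scalar model of a loop split into a "vector" part and a scalar epilogue. This is only a sketch: chunked_copy, the with_epilogue switch and the VF constant are hypothetical, and the inner loop merely stands in for one VF-wide vector operation. A real vector loop without a guard would process a full VF-sized chunk rather than skip it, which is the wrong-code case raised above.

```c
#include <stddef.h>

enum { VF = 16 };  /* assumed vectorization factor, as in the thread */

/* Copy N bytes from IN to OUT.  The first loop models the vector loop
   and only handles whole chunks of VF elements; the optional second
   loop models the scalar epilogue that handles the remainder.
   Returns the number of elements actually copied.  */
size_t
chunked_copy (const char *in, char *out, size_t n, int with_epilogue)
{
  size_t i = 0;
  size_t k;

  /* "Vector" loop: one iteration stands for one VF-wide operation.  */
  for (; i + VF <= n; i += VF)
    for (k = 0; k < VF; k++)
      out[i + k] = in[i + k];

  /* Scalar epilogue: the leftover n % VF elements.  */
  if (with_epilogue)
    for (; i < n; i++)
      out[i] = in[i];

  return i;
}
```

With n = 10 and no epilogue, nothing is copied at all, so dropping the epilogue is only safe when something else guarantees that the trip count is a multiple of VF whenever the vector loop runs.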
[jit] Add type-checking to gcc_jit_context_new_cast
Committed to branch dmalcolm/jit: gcc/jit/ * libgccjit.c (is_valid_cast): New. (gcc_jit_context_new_cast): Check for compatible types. * internal-api.c (gcc::jit::recording::memento_of_get_type:: is_int): New. (gcc::jit::recording::memento_of_get_type::is_float): New. (gcc::jit::recording::memento_of_get_type::is_bool): New. * internal-api.h (gcc::jit::recording::type::is_int): New. (gcc::jit::recording::type::is_float): New. (gcc::jit::recording::type::is_bool): New. (gcc::jit::recording::memento_of_get_type::is_int): New. (gcc::jit::recording::memento_of_get_type::is_float): New. (gcc::jit::recording::memento_of_get_type::is_bool): New. (gcc::jit::recording::memento_of_get_pointer::is_int): New. (gcc::jit::recording::memento_of_get_pointer::is_float): New. (gcc::jit::recording::memento_of_get_pointer::is_bool): New. (gcc::jit::recording::memento_of_get_const::is_int): New. (gcc::jit::recording::memento_of_get_const::is_float): New. (gcc::jit::recording::memento_of_get_const::is_bool): New. (gcc::jit::recording::memento_of_get_volatile::is_int): New. (gcc::jit::recording::memento_of_get_volatile::is_float): New. (gcc::jit::recording::memento_of_get_volatile::is_bool): New. (gcc::jit::recording::array_type::is_int): New. (gcc::jit::recording::array_type::is_float): New. (gcc::jit::recording::array_type::is_bool): New. (gcc::jit::recording::function_type::is_int): New. (gcc::jit::recording::function_type::is_float): New. (gcc::jit::recording::function_type::is_bool): New. (gcc::jit::recording::struct_::is_int): New. (gcc::jit::recording::struct_::is_float): New. (gcc::jit::recording::struct_::is_bool): New. gcc/testsuite/ * jit.dg/test-error-bad-cast.c: New test case. 
--- gcc/jit/ChangeLog.jit | 42 + gcc/jit/internal-api.c | 135 + gcc/jit/internal-api.h | 33 +++ gcc/jit/libgccjit.c| 33 +++ gcc/testsuite/ChangeLog.jit| 4 + gcc/testsuite/jit.dg/test-error-bad-cast.c | 63 ++ 6 files changed, 310 insertions(+) create mode 100644 gcc/testsuite/jit.dg/test-error-bad-cast.c diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit index 87f10a3..260273c 100644 --- a/gcc/jit/ChangeLog.jit +++ b/gcc/jit/ChangeLog.jit @@ -1,5 +1,47 @@ 2014-03-13 David Malcolm dmalc...@redhat.com + * libgccjit.c (is_valid_cast): New. + (gcc_jit_context_new_cast): Check for compatible types. + + * internal-api.c (gcc::jit::recording::memento_of_get_type:: + is_int): New. + (gcc::jit::recording::memento_of_get_type::is_float): New. + (gcc::jit::recording::memento_of_get_type::is_bool): New. + + * internal-api.h (gcc::jit::recording::type::is_int): New. + (gcc::jit::recording::type::is_float): New. + (gcc::jit::recording::type::is_bool): New. + + (gcc::jit::recording::memento_of_get_type::is_int): New. + (gcc::jit::recording::memento_of_get_type::is_float): New. + (gcc::jit::recording::memento_of_get_type::is_bool): New. + + (gcc::jit::recording::memento_of_get_pointer::is_int): New. + (gcc::jit::recording::memento_of_get_pointer::is_float): New. + (gcc::jit::recording::memento_of_get_pointer::is_bool): New. + + (gcc::jit::recording::memento_of_get_const::is_int): New. + (gcc::jit::recording::memento_of_get_const::is_float): New. + (gcc::jit::recording::memento_of_get_const::is_bool): New. + + (gcc::jit::recording::memento_of_get_volatile::is_int): New. + (gcc::jit::recording::memento_of_get_volatile::is_float): New. + (gcc::jit::recording::memento_of_get_volatile::is_bool): New. + + (gcc::jit::recording::array_type::is_int): New. + (gcc::jit::recording::array_type::is_float): New. + (gcc::jit::recording::array_type::is_bool): New. + + (gcc::jit::recording::function_type::is_int): New. + (gcc::jit::recording::function_type::is_float): New. 
+ (gcc::jit::recording::function_type::is_bool): New. + + (gcc::jit::recording::struct_::is_int): New. + (gcc::jit::recording::struct_::is_float): New. + (gcc::jit::recording::struct_::is_bool): New. + +2014-03-13 David Malcolm dmalc...@redhat.com + * internal-api.c (gcc::jit::recording::context::set_str_option): Provide NULL recording::location to add_error. (gcc::jit::recording::context::set_int_option): Likewise. diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c index 692dffb..062095e 100644 --- a/gcc/jit/internal-api.c +++ b/gcc/jit/internal-api.c @@ -856,6 +856,141 @@ recording::memento_of_get_type::dereference () } } +bool
Re: libgo patch committed: Compile math library with -ffp-contract=off
Ian Lance Taylor i...@google.com writes:
> The bug report http://golang.org/issue/7074 shows that math.Log2(1) produces the wrong result on Aarch64, because the Go math package is compiled to use a fused multiply-add instruction. This patch to the libgo configure script will use -ffp-contract=off when compiling the math package on processors other than x86. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu, not that that tests much. Committed to mainline.

Thanks for this! If you are willing to go into battle enough to argue that libgcc should also be compiled with -ffp-contract=off (I did not have the stomach for that fight) then we'll be down to 1 check-go failure on aarch64 (which is peano, due to the absence of split/copyable stacks, and should probably xfail).

Cheers,
mwh

> Ian
>
> diff -r 76dbb6f77e3d libgo/configure.ac
> --- a/libgo/configure.ac Tue Mar 11 12:53:06 2014 -0700
> +++ b/libgo/configure.ac Tue Mar 11 21:26:35 2014 -0700
> @@ -620,6 +620,8 @@
>  MATH_FLAG=
>  if test "$libgo_cv_c_fancymath" = yes; then
>    MATH_FLAG="-mfancy-math-387 -funsafe-math-optimizations"
> +else
> +  MATH_FLAG="-ffp-contract=off"
>  fi
>  AC_SUBST(MATH_FLAG)
Re: libgo patch committed: Compile math library with -ffp-contract=off
On Thu, Mar 13, 2014 at 6:27 PM, Michael Hudson-Doyle michael.hud...@linaro.org wrote:
> Ian Lance Taylor i...@google.com writes:
>
>> The bug report http://golang.org/issue/7074 shows that math.Log2(1) produces the wrong result on Aarch64, because the Go math package is compiled to use a fused multiply-add instruction. This patch to the libgo configure script will use -ffp-contract=off when compiling the math package on processors other than x86. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu, not that that tests much. Committed to mainline.
>
> Thanks for this! If you are willing to go into battle enough to argue that libgcc should also be compiled with -ffp-contract=off (I did not have the stomach for that fight) then we'll be down to 1 check-go failure on aarch64 (which is peano, due to the absence of split/copyable stacks, and should probably xfail).

Hmmm, what is it that fails with libgcc? Is there a bug report for it?

I agree that peano is likely to fail without split stacks.

Ian
Re: libgo patch committed: Compile math library with -ffp-contract=off
Ian Lance Taylor i...@google.com writes:
> On Thu, Mar 13, 2014 at 6:27 PM, Michael Hudson-Doyle michael.hud...@linaro.org wrote:
>> Ian Lance Taylor i...@google.com writes:
>>
>>> The bug report http://golang.org/issue/7074 shows that math.Log2(1) produces the wrong result on Aarch64, because the Go math package is compiled to use a fused multiply-add instruction. This patch to the libgo configure script will use -ffp-contract=off when compiling the math package on processors other than x86. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu, not that that tests much. Committed to mainline.
>>
>> Thanks for this! If you are willing to go into battle enough to argue that libgcc should also be compiled with -ffp-contract=off (I did not have the stomach for that fight) then we'll be down to 1 check-go failure on aarch64 (which is peano, due to the absence of split/copyable stacks, and should probably xfail).
>
> Hmmm, what is it that fails with libgcc? Is there a bug report for it?

https://code.google.com/p/go/issues/detail?id=7066

and then

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714

I wanted to propose a version using Kahan's algorithm for the determinant as described in

http://hal-ens-lyon.archives-ouvertes.fr/docs/00/78/57/86/PDF/Jeannerod_Louvet_Muller_final.pdf

but I haven't gotten around to it...

Cheers,
mwh

> I agree that peano is likely to fail without split stacks.
>
> Ian
Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.
On Wed, Feb 5, 2014 at 2:29 AM, Venkataramanan Kumar venkataramanan.ku...@linaro.org wrote:
> Hi Marcus,
>
>> + "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
>> + [(set_attr "length" "12")])
>>
>> This pattern emits an opaque sequence of instructions that cannot be scheduled, is that necessary? Can we not expand individual instructions or at least split?
>
> Almost all the ports emit a template of assembly instructions. I'm not sure why they have to be generated this way. But the purpose of these patterns is to clear the register that holds the canary value immediately after its use.

http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01981.html answers the original question of why. It was a reply to the exact same question being asked here but about the rs6000 (PowerPC) patch.

Thanks,
Andrew Pinski

>> -/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
>> +/* { dg-do compile { target stack_protection } } */
>> /* { dg-options "-O2 -fstack-protector-strong" } */
>>
>> Do we need a new effective target test, why is the existing "fstack_protector" not appropriate?
>
> "stack_protector" does a run time test. It failed in a cross compilation environment and these are compile-only tests. Also I thought Richard suggested that I add a new option for this.
> ref: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03358.html
>
> regards,
> Venkat.
>
> On 4 February 2014 21:39, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
>> Hi Venkat,
>>
>> On 22 January 2014 16:57, Venkataramanan Kumar venkataramanan.ku...@linaro.org wrote:
>>> Hi Marcus,
>>>
>>> After we changed the frame growing direction (downwards) in Aarch64, the back-end now generates stack smashing set and test based on generic code available in GCC.
>>>
>>> But most of the ports (i386, spu, rs6000, s390, sh, sparc, tilepro and tilegx) define machine descriptions using standard pattern names 'stack_protect_set' and 'stack_protect_test'. This is done for both TLS model as well as global variable based stack guard model.
>>
>>> +
>>> + "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
>>> + [(set_attr "length" "12")])
>>
>> This pattern emits an opaque sequence of instructions that cannot be scheduled, is that necessary? Can we not expand individual instructions or at least split?
>>
>>> + "ldr\t%x3, %x1\;ldr\t%x0, %x2\;eor\t%x0, %x3, %x0"
>>> + [(set_attr "length" "12")])
>>
>> Likewise.
>>
>>> -/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
>>> +/* { dg-do compile { target stack_protection } } */
>>> /* { dg-options "-O2 -fstack-protector-strong" } */
>>
>> Do we need a new effective target test, why is the existing "fstack_protector" not appropriate?
>>
>> Cheers
>> /Marcus