Re: Ping: [PATCH] Add implicit C linkage for win32-specific entry points
Jacek Caban sent this: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01987.html in response to this: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01986.html But it never got reviewed. Could you review and commit? No, I don't have approval rights here, you need a Windows maintainer (Kai). -- Eric Botcazou
Re: divide 64-bit by constant for 32-bit target machines
On 14/06/12 19:46, Dinar Temirbulatov wrote: Hi, OK for trunk? thanks, Dinar.
I'm still not comfortable about the code bloat that this is likely to incur at -O2. R.
On Tue, Jun 12, 2012 at 11:00 AM, Paolo Bonzini bonz...@gnu.org wrote: On 12/06/2012 08:52, Dinar Temirbulatov wrote: is safe? That is, that the underflows cannot produce a wrong result? [snip] Thanks very much! Paolo
ChangeLog.txt
2012-06-14 Dinar Temirbulatov dtemirbula...@gmail.com Alexey Kravets mr.kayr...@gmail.com Paolo Bonzini bonz...@gnu.org
* config/arm/arm.c (arm_rtx_costs_1): Add cost estimate for the integer double-word division operation.
* config/mips/mips.c (mips_rtx_costs): Extend cost estimate for the integer double-word division operation for 32-bit targets.
* gcc/expmed.c (expand_mult_highpart_optab): Allow to generate the higher multiplication product for unsigned double-word integers using 32-bit wide registers.
30.patch
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
But note that in libdecnumber we have:
10de71e1 (meissner 2007-03-24 17:04:47 + 25) typedef unsigned int UINT32;
10de71e1 (meissner 2007-03-24 17:04:47 + 26) typedef unsigned long long UINT64;
10de71e1 (meissner 2007-03-24 17:04:47 + 27) typedef struct { UINT64 w[2]; } UINT128;
...
10de71e1 (meissner 2007-03-24 17:04:47 +28) { { 0x3b645a1cac083127ull, 0x0083126e978d4fdfull } }, /* 3 extra digits */
10de71e1 (meissner 2007-03-24 17:04:47 +29) { { 0x4af4f0d844d013aaULL, 0x00346dc5d6388659ULL } }, /* 10^(-4) * 2^131 */
^^
Generally speaking, I'd avoid taking anything in libdecnumber as an example.
-- Eric Botcazou
Re: Change the ordering of cdce pass
On Fri, Jun 15, 2012 at 3:40 AM, Easwaran Raman era...@google.com wrote: ChangeLog entry has a gcc/ prefix that shouldn't be there. Here is the revised entry: 2012-06-14 Easwaran Raman era...@google.com * passes.c (init_optimization_passes): Remove pass_call_cdce from its current position and insert after pass_dce. Ok. Thanks, Richard. On Thu, Jun 14, 2012 at 6:38 PM, Easwaran Raman era...@google.com wrote: The conditional dead call elimination pass shrink wraps certain dead calls to math functions. It doesn't handle case like this: D.142420_139 = powD.549 (D.142421_138, D.142419_132); fooD.120935.barD.113815 = D.142420_139; # foo.bar is dead here. This code gets cleaned up by DCE and leaves only pow, which can then be shrink-wrapped by cdce. So it seems reasonable to do this reordering. Bootstraps on x86_64 on linux with no test regression. OK for trunk? - Easwaran -- 2012-06-14 Easwaran Raman era...@google.com * gcc/passes.c (init_optimization_passes): Remove pass_call_cdce from its current position and insert after pass_dce. Index: gcc/passes.c === --- gcc/passes.c (revision 188535) +++ gcc/passes.c (working copy) @@ -1374,7 +1374,6 @@ init_optimization_passes (void) NEXT_PASS (pass_complete_unrolli); NEXT_PASS (pass_ccp); NEXT_PASS (pass_forwprop); - NEXT_PASS (pass_call_cdce); /* pass_build_alias is a dummy pass that ensures that we execute TODO_rebuild_alias at this point. Re-building alias information also rewrites no longer addressed @@ -1387,6 +1386,7 @@ init_optimization_passes (void) NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_vrp); NEXT_PASS (pass_dce); + NEXT_PASS (pass_call_cdce); NEXT_PASS (pass_cselim); NEXT_PASS (pass_tree_ifcombine); NEXT_PASS (pass_phiopt);
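[Illustration, not part of the thread] The case described above can be sketched in plain C. This is only a rough model: toy_log is a hypothetical stand-in for a math routine whose sole side effect is setting errno on a domain error, and the second function only approximates what one would expect to remain once DCE has removed the dead store and cdce has shrink-wrapped the call. None of the names below come from GCC.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a math builtin: the only side effect is
   setting errno when the argument is outside the function's domain.  */
static double
toy_log (double x)
{
  if (x <= 0.0)
    {
      errno = EDOM;
      return 0.0;
    }
  return x - 1.0;  /* Placeholder value, not a real logarithm.  */
}

struct foo { double bar; };

/* Shape of the problem case: the result is stored into a field that is
   never read, so the call only looks dead after the store is removed.  */
static void
before_dce (double x)
{
  struct foo f;
  f.bar = toy_log (x);
  (void) f;
}

/* Roughly what should remain after DCE followed by cdce: the call is
   made only for arguments that could set errno.  */
static void
after_dce_and_cdce (double x)
{
  if (x <= 0.0)
    toy_log (x);
}
```

Both versions leave errno in the same state for every argument; the second simply avoids the call on the common in-domain path, which is why running cdce after DCE exposes more opportunities.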
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Fri, Jun 15, 2012 at 7:47 AM, Sharad Singhai sing...@google.com wrote: On Wed, Jun 13, 2012 at 4:48 AM, Richard Guenther richard.guent...@gmail.com wrote: On Fri, Jun 8, 2012 at 7:16 AM, Sharad Singhai sing...@google.com wrote: Okay, I have updated the attached patch so that the output from -ftree-vectorizer-verbose is considered diagnostic information and is always sent to stderr. Other functionality remains unchanged. Here is some more context about this patch. This patch improves the dump infrastructure and public interfaces so that the existing private pass-specific dump stream is separated from the diagnostic dump stream (typically stderr). The optimization passes can output information on the two streams independently. The newly defined interfaces are: Individual passes do not need to access the dump file directly. Thus Instead of doing if (dump_file (flags dump_flags)) fprintf (dump_file, ...); they can do dump_printf (flags, ...); If the current pass has FLAGS enabled then the information gets printed into the dump file otherwise not. Similar to the dump_printf (), another function is defined, called diag_printf (dump_flags, ...) This prints information only onto the diagnostic stream, typically standard error. It is useful for separating pass-specific dump information from the diagnostic information. Currently, as a proof of concept, I have converted vectorizer passes to use the new dump format. For this, I have considered information printed in vect_dump file as diagnostic. Thus 'fprintf' calls are changed to 'diag_printf'. Some other information printed to dump_file is sent to the regular dump file via 'dump_printf ()'. It helps to separate the two streams because they might serve different purposes and might have different formatting requirements. For example, using the trunk compiler, the following invocation g++ -S v.cc -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=2 prints tree vectorizer dump into a file named 'v.cc.113t.vect'. 
However, the verbose diagnostic output is silently ignored. This is not desirable as the two types of dump should not interfere. After this patch, the vectorizer dump is available in 'v.cc.113t.vect' as before, but the verbose vectorizer diagnostic is additionally printed on stderr. Thus both types of dump information are output. An additional feature of this patch is that individual passes can print dump information into command-line named files instead of auto numbered filename. For example, I'd wish you'd leave out this part for a followup. g++ -S -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.vect -ftree-vectorizer-verbose=2 -fdump-tree-pre=foo.pre This prints the tree vectorizer dump into 'foo.vect', PRE dump into 'foo.pre', and the vectorizer verbose diagnostic dump onto stderr. Please take another look. --- tree-vect-loop-manip.c (revision 188325) +++ tree-vect-loop-manip.c (working copy) @@ -789,14 +789,11 @@ slpeel_make_loop_iterate_ntimes (struct loop *loop gsi_remove (loop_cond_gsi, true); loop_loc = find_loop_location (loop); - if (dump_file (dump_flags TDF_DETAILS)) - { - if (loop_loc != UNKNOWN_LOC) - fprintf (dump_file, \nloop at %s:%d: , + if (loop_loc != UNKNOWN_LOC) + dump_printf (TDF_DETAILS, \nloop at %s:%d: , LOC_FILE (loop_loc), LOC_LINE (loop_loc)); - print_gimple_stmt (dump_file, cond_stmt, 0, TDF_SLIM); - } - + if (dump_flags TDF_DETAILS) + dump_gimple_stmt (TDF_SLIM, cond_stmt, 0); loop-nb_iterations = niters; I'm confused by this. Why is this not simply if (loop_loc != UNKNOWN_LOC) dump_printf (dump_flags, \nloop at %s:%d: , LOC_FILE (loop_loc), LOC_LINE (loop_loc)); dump_gimple_stmt (dump_flags | TDF_SLIM, cond_stmt, 0); for example. I notice that you maybe mis-understood the message classification I asked you to add (maybe I confused you by mentioning to eventually re-use the TDF_* flags). I think you basically provided this message classification by adding two classes by providing both dump_gimple_stmt and diag_gimple_stmt. 
But still in the above you keep a dump_flags test _and_ you pass in (altered) dump_flags to the dump/diag_gimple_stmt routines. Let me quote them: +void +dump_gimple_stmt (int flags, gimple gs, int spc) +{ + if (dump_file) + print_gimple_stmt (dump_file, gs, spc, flags); +} +void +diag_gimple_stmt (int flags, gimple gs, int spc) +{ + if (alt_dump_file) + print_gimple_stmt (alt_dump_file, gs, spc, flags); +} I'd say it should have been a single function: void dump_gimple_stmt (enum msg_classification, int additional_flags, gimple gs, int spc) { if (msg_classification go-to-dumpfile dump_file) print_gimple_stmt (dump_file, gs, spc, dump_flags | additional_flags); if (msg_classification go-to-alt-dump-file
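[Illustration, not part of the thread; the archived message above is cut off mid-function] The single-function shape suggested in the review, where a message classification selects the output streams, might be sketched as follows. The enum names, the two stream variables and dump_text are all hypothetical; the real interface was still under discussion, and this version prints plain text rather than GIMPLE statements.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical classification bits, not the names used in GCC.  */
enum msg_classification
{
  MSG_TO_DUMP_FILE = 1 << 0,
  MSG_TO_ALT_DUMP_FILE = 1 << 1
};

/* Stand-ins for the pass-specific dump stream and the diagnostic
   stream (typically stderr).  Either may be NULL when disabled.  */
static FILE *dump_stream;
static FILE *alt_dump_stream;

/* Send TEXT to every enabled stream selected by CLASSIFICATION and
   return the number of streams written to.  */
static int
dump_text (int classification, const char *text)
{
  int written = 0;

  assert (text != NULL);
  if ((classification & MSG_TO_DUMP_FILE) && dump_stream)
    {
      fputs (text, dump_stream);
      written++;
    }
  if ((classification & MSG_TO_ALT_DUMP_FILE) && alt_dump_stream)
    {
      fputs (text, alt_dump_stream);
      written++;
    }
  return written;
}
```

With this shape a caller never tests the stream pointers or dump flags itself; it states what kind of message it has and the helper decides where the message goes.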
Re: [Patch, ARM][1/8] Epilogue in RTL: update ldm_stm_operation_p
On 31/05/12 14:50, Greta Yorsh wrote:
This patch updates ldm_stm_operation_p to check, for loads, that if SP is in the register list then the base register is also SP. It guarantees that SP is reset correctly when an LDM instruction is interrupted. Otherwise, we might end up with a corrupt stack.
ChangeLog: gcc 2012-05-31 Greta Yorsh greta.yo...@arm.com
* config/arm/arm.c (ldm_stm_operation_p): Require SP as base register for loads if SP is in the register list.
1-update-predicate.patch.txt
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e3290e2..4717725 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -10247,6 +10247,12 @@ ldm_stm_operation_p (rtx op, bool load, enum machine_mode mode,
   if (!REG_P (addr))
     return false;
 
+  /* Don't allow SP to be loaded unless it is also the base register. It
+     guarantees that SP is reset correctly when an LDM instruction
+     is interrupted. Otherwise, we might end up with a corrupt stack. */
+  if (load && (REGNO (reg) == SP_REGNUM) && (REGNO (addr) != SP_REGNUM))
+    return false;
+
   for (; i < count; i++)
     {
       elt = XVECEXP (op, 0, i);
@@ -10270,6 +10276,10 @@ ldm_stm_operation_p (rtx op, bool load, enum machine_mode mode,
           || (consecutive
               && (REGNO (reg) !=
                   (unsigned int) (first_regno + regs_per_val * (i - base))))
+          /* Don't allow SP to be loaded unless it is also the base register. It
+             guarantees that SP is reset correctly when an LDM instruction
+             is interrupted. Otherwise, we might end up with a corrupt stack. */
+          || (load && (REGNO (reg) == SP_REGNUM) && (REGNO (addr) != SP_REGNUM))
           || !MEM_P (mem)
           || GET_MODE (mem) != mode
           || ((GET_CODE (XEXP (mem, 0)) != PLUS
OK. R.
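[Illustration, not part of the thread] The rule the patch adds can be restated outside of RTL as a check over plain register numbers. The helper name sp_rule_ok and its array-based interface are assumptions made for this sketch; only the condition itself and the ARM stack pointer number (13) mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define SP_REGNUM 13  /* ARM stack pointer register number.  */

/* Return true if a load/store-multiple with base register BASE_REGNO and
   register list REGS[0..COUNT-1] passes the SP rule: a load may include
   SP in the list only when SP is also the base register.  */
static bool
sp_rule_ok (bool load, unsigned int base_regno,
            const unsigned int *regs, int count)
{
  int i;

  assert (count >= 0);
  for (i = 0; i < count; i++)
    if (load && regs[i] == SP_REGNUM && base_regno != SP_REGNUM)
      return false;
  return true;
}
```

For a list such as {r4, fp, sp, pc}, the check accepts a load based on SP, rejects the same load based on fp, and places no restriction on stores.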
Re: [Patch, ARM][2/8] Epilogue in RTL: new patterns for int regs
On 31/05/12 14:53, Greta Yorsh wrote: This patch adds new define_insn patterns for epilogue with integer registers. The patterns can handle pop multiple with writeback and return (loading into PC directly). To handle return, the patterns use a new special predicate pop_multiple_return, that uses ldm_stm_operation_p function from a previous patch. To output assembly, the patterns use a new function arm_output_multireg_pop. This patch also adds a new function arm_emit_multi_reg_pop that emits RTL that matches the new pop patterns for integer registers. This is a helper function for epilogue expansion. It is used by a later patch. ChangeLog: gcc 2012-05-31 Ian Bolton ian.bol...@arm.com Sameera Deshpande sameera.deshpa...@arm.com Greta Yorsh greta.yo...@arm.com * config/arm/arm.md (load_multiple_with_writeback) New define_insn. (load_multiple, pop_multiple_with_writeback_and_return) Likewise. (pop_multiple_with_return, ldr_with_return) Likewise. * config/arm/predicates.md (pop_multiple_return) New special predicate. * config/arm/arm-protos.h (arm_output_multireg_pop) New declaration. * config/arm/arm.c (arm_output_multireg_pop) New function. (arm_emit_multi_reg_pop): New function. (ldm_stm_operation_p): Check SP in the register list. 2-patterns.patch.txt diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 4717725..9093801 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -13815,6 +13815,84 @@ vfp_output_fldmd (FILE * stream, unsigned int base, int reg, int count) } +/* OPERANDS[0] is the entire list of insns that constitute pop, + OPERANDS[1] is the base register, RETURN_PC is true iff return insn + is in the list, UPDATE is true iff the list contains explicit + update of base register. + */ Close of comment should not be on a separate line. +void +arm_output_multireg_pop (rtx *operands, bool return_pc, rtx cond, bool reverse, + bool update) + offset += return_pc ? 1 : 0; + + /* Is the base register in the list? 
*/ Two spaces at end of comment before */. + for (i = offset; i num_saves; i++) +{ + regno = REGNO (XEXP (XVECEXP (operands[0], 0, i), 0)); + /* If SP is in the list, then the base register must be SP. */ And here. + gcc_assert ((regno != SP_REGNUM) || (regno_base == SP_REGNUM)); + /* If base register is in the list, there must be no explicit update. */ + if (regno == regno_base) +gcc_assert (!update); +} + + conditional = reverse ? %?%D0 : %?%d0; + if ((regno_base == SP_REGNUM) TARGET_UNIFIED_ASM) +{ + /* Output pop (not stmfd) because it has a shorter encoding. */ And here. + gcc_assert (update); + sprintf (pattern, pop%s\t{, conditional); +} + else +{ + /* Output ldmfd when the base register is SP, otherwise output ldmia. + It's just a convention, their semantics are identical. */ + if (regno_base == SP_REGNUM) +sprintf (pattern, ldm%sfd\t, conditional); + else if (TARGET_UNIFIED_ASM) +sprintf (pattern, ldmia%s\t, conditional); + else +sprintf (pattern, ldm%sia\t, conditional); + + strcat (pattern, reg_names[regno_base]); + if (update) +strcat (pattern, !, {); + else +strcat (pattern, , {); +} + + /* Output the first destination register. */ And here. diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index ed33c9b..862ccf4 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -10959,6 +10959,89 @@ [(set_attr type f_fpa_store)] ) +;; Pop (as used in epilogue RTL) +;; +(define_insn *load_multiple_with_writeback + [(match_parallel 0 load_multiple_operation +[(set (match_operand:SI 1 s_register_operand +rk) + (plus:SI (match_dup 1) + (match_operand:SI 2 const_int_operand I))) + (set (match_operand:SI 3 s_register_operand =rk) + (mem:SI (match_dup 1))) +])] + TARGET_32BIT (reload_in_progress || reload_completed) + * + { +arm_output_multireg_pop (operands, /*return_pc=*/FALSE, + /*cond=*/const_true_rtx, + /*reverse=*/FALSE, + /*update=*/TRUE); Use lower case for TRUE and FALSE. Several instances later on as well. OK with those changes. R.
Re: [PATCH] Small tree-vect-pattern.c cleanup
On Thu, Jun 14, 2012 at 11:00 PM, Jakub Jelinek ja...@redhat.com wrote:
Hi! While looking at pattern recognizer, I've noticed that we needlessly allocate a single member array from heap. An automatic variable for that would be fine, but BB_VINFO_BB is also addressable. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Ok. Thanks, Richard.
2012-06-14 Jakub Jelinek ja...@redhat.com
* tree-vect-patterns.c (vect_pattern_recog): Don't unnecessarily allocate and free bbs array for the SLP case.
--- gcc/tree-vect-patterns.c.jj 2012-06-14 13:22:27.0 +0200
+++ gcc/tree-vect-patterns.c 2012-06-14 15:33:16.335453016 +0200
@@ -2983,7 +2983,7 @@ void
 vect_pattern_recog (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
 {
   struct loop *loop;
-  basic_block *bbs, bb;
+  basic_block *bbs;
   unsigned int nbbs;
   gimple_stmt_iterator si;
   unsigned int i, j;
@@ -3002,10 +3002,8 @@ vect_pattern_recog (loop_vec_info loop_v
     }
   else
     {
-      bb = BB_VINFO_BB (bb_vinfo);
+      bbs = &BB_VINFO_BB (bb_vinfo);
       nbbs = 1;
-      bbs = XNEW (basic_block);
-      bbs[0] = bb;
     }
 
   /* Scan through the loop stmts, applying the pattern recognition
@@ -3031,6 +3029,4 @@ vect_pattern_recog (loop_vec_info loop_v
     }
 
   VEC_free (gimple, heap, stmts_to_replace);
-  if (bb_vinfo)
-    free (bbs);
 }
Jakub
Re: [patch] Fix PR middle-end/53590
If we don't do it, we'll get another PR saying this works in LTO mode with other versions of the Ada compiler (which is true) so I'll proceed. Here is what I've installed after bootstrapping/regtesting on x86_64-suse-linux and i586-suse-linux. It adds the flag to 'struct function' and streams it. While I was at it, I also changed the trivially-dead-insns machinery in cse.c and dse.c to use the same predicate as dce.c, namely !insn_nothrow_p instead of insn_could_throw_p. The former is more precise and well suited to these optimization passes. cfgexpand.c and reload1.c keep using the latter when they are massaging the RTL stream. PR middle-end/53590 * common.opt (-fdelete-dead-exceptions): New switch. * doc/invoke.texi (Code Gen Options): Document it. * cse.c (count_reg_usage) <CALL_INSN>: Use !insn_nothrow_p in lieu of insn_could_throw_p predicate. Do not skip an insn that could throw if dead exceptions can be deleted. (insn_live_p): Likewise, do not return true in that case. * dce.c (can_alter_cfg): New flag. (deletable_insn_p): Do not return false for an insn that can throw if the CFG can be altered and dead exceptions can be deleted. (init_dce): Set can_alter_cfg to false for fast DCE, true otherwise. * dse.c (scan_insn): Use !insn_nothrow_p in lieu of insn_could_throw_p predicate. Do not preserve an insn that could throw if dead exceptions can be deleted. * function.h (struct function): Add can_delete_dead_exceptions flag. * function.c (allocate_struct_function): Set it. * lto-streamer-in.c (input_struct_function_base): Stream it. * lto-streamer-out.c (output_struct_function_base): Likewise. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Do not mark a statement that could throw as necessary if dead exceptions can be deleted. ada/ * gcc-interface/misc.c (gnat_init_options_struct): Set opts->x_flag_delete_dead_exceptions to 1. 
-- Eric Botcazou Index: doc/invoke.texi === --- doc/invoke.texi (revision 188647) +++ doc/invoke.texi (working copy) @@ -975,7 +975,7 @@ See S/390 and zSeries Options. @xref{Code Gen Options,,Options for Code Generation Conventions}. @gccoptlist{-fcall-saved-@var{reg} -fcall-used-@var{reg} @gol -ffixed-@var{reg} -fexceptions @gol --fnon-call-exceptions -funwind-tables @gol +-fnon-call-exceptions -fdelete-dead-exceptions -funwind-tables @gol -fasynchronous-unwind-tables @gol -finhibit-size-directive -finstrument-functions @gol -finstrument-functions-exclude-function-list=@var{sym},@var{sym},@dots{} @gol @@ -19317,6 +19317,14 @@ instructions to throw exceptions, i.e.@: instructions. It does not allow exceptions to be thrown from arbitrary signal handlers such as @code{SIGALRM}. +@item -fdelete-dead-exceptions +@opindex fdelete-dead-exceptions +Consider that instructions that may throw exceptions but don't otherwise +contribute to the execution of the program can be optimized away. +This option is enabled by default for the Ada front end, as permitted by +the Ada language specification. +Optimization passes that cause dead exceptions to be removed are enabled independently at different optimization levels. + @item -funwind-tables @opindex funwind-tables Similar to @option{-fexceptions}, except that it just generates any needed Index: cse.c === --- cse.c (revision 188647) +++ cse.c (working copy) @@ -599,7 +599,6 @@ static void invalidate_from_clobbers (rt static void invalidate_from_sets_and_clobbers (rtx); static rtx cse_process_notes (rtx, rtx, bool *); static void cse_extended_basic_block (struct cse_basic_block_data *); -static void count_reg_usage (rtx, int *, rtx, int); static int check_for_label_ref (rtx *, void *); extern void dump_class (struct table_elt*); static void get_cse_reg_info_1 (unsigned int regno); @@ -6692,10 +6691,11 @@ count_reg_usage (rtx x, int *counts, rtx case CALL_INSN: case INSN: case JUMP_INSN: - /* We expect dest to be NULL_RTX here. 
If the insn may trap, + /* We expect dest to be NULL_RTX here. If the insn may throw, or if it cannot be deleted due to side-effects, mark this fact by setting DEST to pc_rtx. */ - if (insn_could_throw_p (x) || side_effects_p (PATTERN (x))) + if ((!cfun->can_delete_dead_exceptions && !insn_nothrow_p (x)) + || side_effects_p (PATTERN (x))) dest = pc_rtx; if (code == CALL_INSN) count_reg_usage (CALL_INSN_FUNCTION_USAGE (x), counts, dest, incr); @@ -6800,7 +6800,7 @@ static bool insn_live_p (rtx insn, int *counts) { int i; - if (insn_could_throw_p (insn)) + if (!cfun->can_delete_dead_exceptions && !insn_nothrow_p (insn)) return true; else if (GET_CODE (PATTERN (insn)) == SET) return set_live_p (PATTERN
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
On 06/15/2012 09:12 AM, Eric Botcazou wrote: Generally speaking, I'd avoid taking anything in libdecnumber as an example. It's not about example, but the fact that host compilers have been compiling that code as part of building gcc for years, without anyone complaining, afaik. It doesn't matter whether the code pointed at is the ugliest or most beautiful code on earth. What matters is whether it uses long long unconditionally on all hosts or not. IOW, what are the still supported hosts/compilers that don't support long long? If there are any, it appears none has been used in at least the past 5 years, IIU the code correctly. (This is not just an unfounded, OOC, question. We just recently went through the exercise of coming up with an interface for an include/ header, http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01424.html http://sourceware.org/ml/binutils/2012-05/msg00344.html where we had some back and forth on the use of long long. After all that, we ended up finding that libdecnumber uses long long unconditionally, http://sourceware.org/ml/gdb-patches/2012-05/msg01078.html so in practice, GDB has been relying on long long existing for as long as libdecnumber has been used in GDB. The same should hold true for gcc.) -- Pedro Alves
[Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
My plugin is written in C++. When including headers from gcc-4.6 it wraps them in 'extern "C"' to prevent name mangling. Some of the plugin headers include gcc/system.h which includes the C++ header <cstring> if it detects the use of a C++ compiler. As a result <cstring> routines included this way end up wrapped in 'extern "C"', while those included directly from C++ aren't 'extern "C"'. This doesn't worry g++, but clang gets upset, erroring out with a complaint about multiple inconsistent declarations of memchr and friends. Is the following patch OK to apply to gcc-4.6? And is it in principle OK to apply to gcc-4.7 (I didn't test it there yet)? It would be useful if gcc-4.7 is compiled as C. Thanks, Duncan.
Index: gcc/system.h
===
--- gcc/system.h (revision 188518)
+++ gcc/system.h (working copy)
@@ -191,7 +191,9 @@
 #endif
 
 #ifdef __cplusplus
+extern "C++" {
 # include <cstring>
+}
 #endif
 
 /* Some of glibc's string inlines cause warnings. Plus we'd rather
Re: [patch] Fix PR middle-end/53590
On Fri, Jun 15, 2012 at 11:13 AM, Eric Botcazou ebotca...@adacore.com wrote: If we don't do it, we'll get another PR saying this works in LTO mode with other versions of the Ada compiler (which is true) so I'll proceed. Here is what I've installed after bootstrapping/regtesting on x86_64-suse-linux and i586-suse-linux. It adds the flag to 'struct function' and streams it. Btw, I think we should enable this flag by default for all languages but Java so that if you enable -fnon-call-exceptions for C or C++ you don't get too many spurious exceptions from dead code. Thanks, Richard. While I was at it, I also changed the trivially-dead-insns machinery in cse.c and dse.c to use the same predicate as dce.c, namely !insn_nothrow_p instead of insn_could_throw_p. The former is more precise and well suited to these optimization passes. cfgexpand.c and reload1.c keep using the latter when they are massaging the RTL stream. PR middle-end/53590 * common.opt (-fdelete-dead-exceptions): New switch. * doc/invoke.texi (Code Gen Options): Document it. * cse.c (count_reg_usage) <CALL_INSN>: Use !insn_nothrow_p in lieu of insn_could_throw_p predicate. Do not skip an insn that could throw if dead exceptions can be deleted. (insn_live_p): Likewise, do not return true in that case. * dce.c (can_alter_cfg): New flag. (deletable_insn_p): Do not return false for an insn that can throw if the CFG can be altered and dead exceptions can be deleted. (init_dce): Set can_alter_cfg to false for fast DCE, true otherwise. * dse.c (scan_insn): Use !insn_nothrow_p in lieu of insn_could_throw_p predicate. Do not preserve an insn that could throw if dead exceptions can be deleted. * function.h (struct function): Add can_delete_dead_exceptions flag. * function.c (allocate_struct_function): Set it. * lto-streamer-in.c (input_struct_function_base): Stream it. * lto-streamer-out.c (output_struct_function_base): Likewise. 
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Do not mark a statement that could throw as necessary if dead exceptions can be deleted. ada/ * gcc-interface/misc.c (gnat_init_options_struct): Set opts->x_flag_delete_dead_exceptions to 1. -- Eric Botcazou
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 11:27 AM, Duncan Sands baldr...@free.fr wrote: My plugin is written in C++. When including headers from gcc-4.6 it wraps them in 'extern "C"' to prevent name mangling. Some of the plugin headers include gcc/system.h which includes the C++ header <cstring> if it detects the use of a C++ compiler. As a result <cstring> routines included this way end up wrapped in 'extern "C"', while those included directly from C++ aren't 'extern "C"'. This doesn't worry g++, but clang gets upset, erroring out with a complaint about multiple inconsistent declarations of memchr and friends. Is the following patch OK to apply to gcc-4.6? And is it in principle OK to apply to gcc-4.7 (I didn't test it there yet)? It would be useful if gcc-4.7 is compiled as C.
Uh, I don't think we should do that. Why do we include <cstring> here anyways? Ian - you added this include in rev. 167764, I don't think that was proper. But I'm not sure wrapping a system.h include inside extern "C" from a C++ plugin is proper either ... Thanks, Richard.
Thanks, Duncan.
Index: gcc/system.h
===
--- gcc/system.h (revision 188518)
+++ gcc/system.h (working copy)
@@ -191,7 +191,9 @@
 #endif
 
 #ifdef __cplusplus
+extern "C++" {
 # include <cstring>
+}
 #endif
 
 /* Some of glibc's string inlines cause warnings. Plus we'd rather
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
It's not about example, but the fact that host compilers have been compiling that code as part of building gcc for years, without anyone complaining, afaik. It doesn't matter whether the code pointed at is the ugliest or most beautiful code on earth. What matters is whether it uses long long unconditionally on all hosts or not. IOW, what are the still supported hosts/compilers that don't support long long? If there are any, it appears none has been used in at least the past 5 years, IIU the code correctly. OK, but GCC still officially requires only an ISO C90 compiler http://gcc.gnu.org/install/prerequisites.html so the usage of 'long long' in libdecnumber is a bug that could be fixed at some point. That's why using it as a precedent isn't the best thing to do. -- Eric Botcazou
Re: [PATCH, GCC][AArch64] Update LINK_SPEC
On 14/06/12 13:24, Sofiane Naci wrote: Hi, This patch updates LINK_SPEC in the AArch64 port. Thanks Sofiane - 2012-06-14 Sofiane Naci sofiane.n...@arm.com [AArch64] Update LINK_SPEC. * config/aarch64/aarch64-linux.h (LINUX_TARGET_LINK_SPEC): Remove %{version:-v}, %{b} and %{!dynamic-linker}. OK
Re: [patch] Fix PR middle-end/53590
Btw, I think we should enable this flag by default for all languages but Java so that if you enable -fnon-call-exceptions for C or C++ you don't get too many spurious exceptions from dead code. The flag isn't formally tied to -fnon-call-exceptions though, so there might be subtleties in C++, but I agree that in practice this should work fine. -- Eric Botcazou
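[Illustration, not part of the thread] For C, the kind of code the flag is about can be as small as the function below. The assumption is a build with -fnon-call-exceptions, where the division counts as an instruction that may throw; the documentation text quoted earlier in this thread says -fdelete-dead-exceptions lets the optimizers remove such an instruction when its result is otherwise unused. Whether a given pass actually removes it depends on the optimization level, so no particular assembly output is claimed here.

```c
#include <assert.h>

/* The quotient is computed but never contributes to the result.  With
   -fnon-call-exceptions the division is a potentially throwing insn;
   with -fdelete-dead-exceptions it may be optimized away regardless.
   Passing B == 0 is undefined behaviour in C either way.  */
static int
dead_division (int a, int b)
{
  int unused = a / b;
  (void) unused;
  return a;
}
```

A caller that relied on the trap from a dead division to detect a zero divisor would therefore see different behaviour once the flag is on, which is the kind of subtlety the reply above alludes to.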
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
Hi Richard,
Uh, I don't think we should do that. Why do we include <cstring> here anyways? Ian - you added this include in rev. 167764, I don't think that was proper. But I'm not sure wrapping a system.h include inside extern "C" from a C++ plugin is proper either ...
since the plugin needs to call GCC routines, and GCC is built as C, it has to wrap at least some GCC headers in extern "C" to avoid mangling of the names of those GCC routines (otherwise you can't load the plugin because the linker will look for the mangled names in GCC and not find them). But perhaps you know a trick to avoid the name mangling problem? It is true that maybe via a careful dance it is possible to not wrap system.h in extern "C" - I will give it a go. Ciao, Duncan.
Thanks, Richard.
Thanks, Duncan.
Index: gcc/system.h
===
--- gcc/system.h (revision 188518)
+++ gcc/system.h (working copy)
@@ -191,7 +191,9 @@
 #endif
 
 #ifdef __cplusplus
+extern "C++" {
 # include <cstring>
+}
 #endif
 
 /* Some of glibc's string inlines cause warnings. Plus we'd rather
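[Illustration, not part of the thread] The nesting discussed in this thread can be reproduced in a small file that compiles as either C or C++. In C++, linkage specifications nest and the innermost one determines the language linkage, so the header is seen with C++ linkage even though an outer extern "C" block is open. The helper has_char is a hypothetical example function; the two extern blocks only mimic what the plugin and the patched system.h would do.

```c
#include <assert.h>

#ifdef __cplusplus
extern "C" {        /* What the plugin wraps around GCC headers.  */
extern "C++" {      /* What the patch adds inside system.h.  */
# include <cstring>
}
}
using std::memchr;
using std::strlen;
#else
# include <string.h>
#endif

/* Return nonzero if character C occurs in the string S.  */
static int
has_char (const char *s, int c)
{
  assert (s != NULL);
  return memchr (s, c, strlen (s)) != NULL;
}
```

When built as C the preprocessor simply takes the string.h branch, which matches the situation where GCC itself is compiled as C and no linkage specification is involved.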
Re: Ping: [PATCH] Add implicit C linkage for win32-specific entry points
2012/6/15 Eric Botcazou ebotca...@adacore.com: Jacek Caban sent this: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01987.html in response to this: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01986.html But it never got reviewed. Could you review and commit? No, I don't have approval rights here, you need a Windows maintainer (Kai). -- Eric Botcazou The patch is from my point of view ok. We need here for the introduction of CPP_IMPLICIT_TARGET_CLANG and its use in cp/decl.c the approval of a C++ maintainer (jason?). One nit I have about the ChangeLog entry. The C++ change needs a separate ChangeLog entry under cp/. Regards, Kai
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
On 15/06/12 10:48, Eric Botcazou wrote: It's not about example, but the fact that host compilers have been compiling that code as part of building gcc for years, without anyone complaining, afaik. It doesn't matter whether the code pointed at is the ugliest or most beautiful code on earth. What matters is whether it uses long long unconditionally on all hosts or not. IOW, what are the still supported hosts/compilers that don't support long long? If there are any, it appears none has been used in at least the past 5 years, IIU the code correctly. OK, but GCC still officially requires only an ISO C90 compiler http://gcc.gnu.org/install/prerequisites.html so the usage of 'long long' in libdecnumber is a bug that could be fixed at some point. That's why using it as a precedent isn't the best thing to do. There are several ports that currently require long long support in the back-end -- see need_64bit_hwint in config.gcc. R.
Re: [Patch, ARM][3/8] Epilogue in RTL: new patterns for vfp regs
On 31/05/12 14:55, Greta Yorsh wrote: New define insn pattern for epilogue with floating point registers (DFmode) and a new function that emits RTL for this pattern. This function is a helper for epilogue extension. It is used by a later patch. ChangeLog: gcc 2012-05-31 Ian Bolton ian.bol...@arm.com Sameera Deshpande sameera.deshpa...@arm.com Greta Yorsh greta.yo...@arm.com * config/arm/arm.md (vfp_pop_multiple_with_writeback) New define_insn. * config/arm/predicates.md (pop_multiple_fp) New special predicate. * config/arm/arm.c (arm_emit_vfp_multi_reg_pop): New function. OK. R.
Re: [Patch, ARM][4/8] Epilogue in RTL: expand epilogue for apcs frame
On 31/05/12 14:58, Greta Yorsh wrote: Helper function for epilogue expansion. Emit RTL for APCS frame epilogue (when -mapcs-frame command line option is specified). This function is used by a later patch. For APCS frame epilogue, the compiler currently generates LDM with SP as both the base register and one of the destination registers. For example:
@ APCS_FRAME epilogue
ldmfd sp, {r4, fp, sp, pc}
@ non-APCS_FRAME epilogue
ldmfd sp!, {r4, fp, pc}
The use of SP in LDM register list is deprecated, but this patch does not address the problem. To generate the epilogue for APCS frame in RTL, this patch adds a new alternative to arm_addsi3 insn in ARM mode only to generate sub sp, fp, #imm. Previously, there was no pattern to generate sub with SP as the destination register and not SP as the operand register. ChangeLog: gcc 2012-05-31 Ian Bolton ian.bol...@arm.com Sameera Deshpande sameera.deshpa...@arm.com Greta Yorsh greta.yo...@arm.com * config/arm/arm.c (arm_expand_epilogue_apcs_frame): New function. * config/arm/arm.md (arm_addsi3) Add an alternative.
The FPA support is now obsolete. Please remove that. OK with that change. R.
Re: [Patch, ARM][5/8] Epilogue in RTL: expand
On 31/05/12 14:59, Greta Yorsh wrote: The main function for epilogue RTL generation, used by expand epilogue patterns. ChangeLog: gcc 2012-05-31 Ian Bolton ian.bol...@arm.com Sameera Deshpande sameera.deshpa...@arm.com Greta Yorsh greta.yo...@arm.com * config/arm/arm-protos.h (arm_expand_epilogue): New declaration. * config/arm/arm.c (arm_expand_epilogue): New function. * config/arm/arm.md (epilogue): Update condition and code. (sibcall_epilogue): Likewise. Same as last patch, OK once the FPA support has been stripped out. R.
[Ada] Fix PR ada/53592
This is the ICE on the assignment to a component of a vector_type, which comes from the VIEW_CONVERT_EXPR generated to turn it into an array. Now that VECTOR_TYPEs can be GIMPLE registers, this construct breaks. Fixed by marking the vector as addressable, as suggested by Richard.

Tested on i586-suse-linux, applied on mainline and 4.7 branch.

2012-06-15 Eric Botcazou ebotca...@adacore.com

  PR ada/53592
  * gcc-interface/gigi.h (maybe_vector_array): Make static inline.
  * gcc-interface/utils.c (maybe_vector_array): Delete.
  * gcc-interface/trans.c (gnat_to_gnu) <N_Indexed_Component>: Mark the array object as addressable if it has vector type and is on the LHS.

2012-06-15 Eric Botcazou ebotca...@adacore.com

  * gnat.dg/vect8.ad[sb]: New test.

--
Eric Botcazou

Index: gcc-interface/utils.c
===================================================================
--- gcc-interface/utils.c (revision 188647)
+++ gcc-interface/utils.c (working copy)
@@ -5149,20 +5149,6 @@ maybe_unconstrained_array (tree exp)
 
   return exp;
 }
-
-/* If EXP's type is a VECTOR_TYPE, return EXP converted to the associated
-   TYPE_REPRESENTATIVE_ARRAY.  */
-
-tree
-maybe_vector_array (tree exp)
-{
-  tree etype = TREE_TYPE (exp);
-
-  if (VECTOR_TYPE_P (etype))
-    exp = convert (TYPE_REPRESENTATIVE_ARRAY (etype), exp);
-
-  return exp;
-}
 
 /* Return true if EXPR is an expression that can be folded as an operand
    of a VIEW_CONVERT_EXPR.  See ada-tree.h for a complete rationale.  */
Index: gcc-interface/gigi.h
===================================================================
--- gcc-interface/gigi.h (revision 188647)
+++ gcc-interface/gigi.h (working copy)
@@ -783,10 +783,6 @@ extern tree remove_conversions (tree exp
    likewise return an expression pointing to the underlying array.  */
 extern tree maybe_unconstrained_array (tree exp);
 
-/* If EXP's type is a VECTOR_TYPE, return EXP converted to the associated
-   TYPE_REPRESENTATIVE_ARRAY.  */
-extern tree maybe_vector_array (tree exp);
-
 /* Return an expression that does an unchecked conversion of EXPR to TYPE.
    If NOTRUNC_P is true, truncation operations should be suppressed.  */
 extern tree unchecked_convert (tree type, tree expr, bool notrunc_p);
@@ -1033,6 +1029,20 @@ extern void enumerate_modes (void (*f) (
 /* Convenient shortcuts.  */
 #define VECTOR_TYPE_P(TYPE) (TREE_CODE (TYPE) == VECTOR_TYPE)
 
+/* If EXP's type is a VECTOR_TYPE, return EXP converted to the associated
+   TYPE_REPRESENTATIVE_ARRAY.  */
+
+static inline tree
+maybe_vector_array (tree exp)
+{
+  tree etype = TREE_TYPE (exp);
+
+  if (VECTOR_TYPE_P (etype))
+    exp = convert (TYPE_REPRESENTATIVE_ARRAY (etype), exp);
+
+  return exp;
+}
+
 static inline unsigned HOST_WIDE_INT
 ceil_pow2 (unsigned HOST_WIDE_INT x)
 {
Index: gcc-interface/trans.c
===================================================================
--- gcc-interface/trans.c (revision 188647)
+++ gcc-interface/trans.c (working copy)
@@ -5372,7 +5372,12 @@ gnat_to_gnu (Node_Id gnat_node)
 
         /* Convert vector inputs to their representative array type, to fit
            what the code below expects.  */
-        gnu_array_object = maybe_vector_array (gnu_array_object);
+        if (VECTOR_TYPE_P (TREE_TYPE (gnu_array_object)))
+          {
+            if (present_in_lhs_or_actual_p (gnat_node))
+              gnat_mark_addressable (gnu_array_object);
+            gnu_array_object = maybe_vector_array (gnu_array_object);
+          }
 
         gnu_array_object = maybe_unconstrained_array (gnu_array_object);
 

package body Vect8 is

   function Foo (V : Vec) return Vec is
      Ret : Vec;
   begin
      Ret (1) := V (1) + V (2);
      Ret (2) := V (1) - V (2);
      return Ret;
   end;

end Vect8;

-- { dg-do compile }

package Vect8 is

   type Vec is array (1 .. 2) of Long_Float;
   pragma Machine_Attribute (Vec, "vector_type");

   function Foo (V : Vec) return Vec;

end Vect8;
Re: [Patch, ARM][6/8] Epilogue in RTL: simple return
On 31/05/12 15:02, Greta Yorsh wrote:

Add a new parameter to the function output_return_instruction to handle simple cases of return when no epilogue needs to be printed out.

ChangeLog:

gcc

2012-05-31 Ian Bolton ian.bol...@arm.com
           Sameera Deshpande sameera.deshpa...@arm.com
           Greta Yorsh greta.yo...@arm.com

  * config/arm/arm-protos.h (output_return_instruction): New parameter.
  * config/arm/arm.c (output_return_instruction): New parameter.
  * config/arm/arm.md (arm_simple_return): New pattern.
  (arm_return, cond_return, cond_return_inverted): Add new arguments.
  * config/arm/thumb2.md (thumb2_return): Update condition and code.

Since you're changing output_return_instruction, please update it to use the bool type for the flags; then modify the callers to use 'true' and 'false' rather than 'TRUE' and 'FALSE'.

OK with that change.

R.
Re: [Patch, ARM][7/8] Epilogue in RTL: expand thumb2 return
On 31/05/12 15:04, Greta Yorsh wrote: Generate RTL for return in Thumb2 mode. Used by expand of return insn. ChangeLog: gcc 2012-05-31 Ian Bolton ian.bol...@arm.com Sameera Deshpande sameera.deshpa...@arm.com Greta Yorsh greta.yo...@arm.com * config/arm/arm-protos.h (thumb2_expand_return): New declaration. * config/arm/arm.c (thumb2_expand_return): New function. * config/arm/arm.md (return): Update condition and code. OK. R.
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
There are several ports that currently require long long support in the back-end -- see need_64bit_hwint in config.gcc. Yes, all the 64-bit ports at least, but you shouldn't need 'long long' to build the compiler e.g. for the AVR. -- Eric Botcazou
[PATCH] New testcase
This adds a testcase I reduced from a genmodes miscompile with one of my pending VRP patches.

Committed.

Richard.

2012-06-15 Richard Guenther rguent...@suse.de

  * gcc.c-torture/execute/20120615-1.c: New testcase.

Index: gcc/testsuite/gcc.c-torture/execute/20120615-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/20120615-1.c (revision 0)
+++ gcc/testsuite/gcc.c-torture/execute/20120615-1.c (revision 0)
@@ -0,0 +1,16 @@
+extern void abort (void);
+
+void __attribute__((noinline,noclone))
+     test1(int i)
+{
+  if (i == 12)
+    return;
+  if (i != 17)
+    {
+      if (i == 15)
+        return;
+      abort ();
+    }
+}
+
+int main() { test1 (15); return 0; }
Re: RFA: better gimplification of compound literals
Hi, On Thu, 14 Jun 2012, Richard Guenther wrote: Restarted regstrapping the thing on x86_64 again. Okay if that passes? Ok. But I wonder how the symtab cannot be ready when we gimplify - after all we gimplify only from after cgraph_finalize_compilation_unit ... Ready may have been the wrong word. There is no entry for the vtable object in the symtab at that point. Only for all the functions. It also never is generated later, so if there had ever been a folding for such statement before it would have crashed already without my patch. I don't know if that is by intention or an oversight. Ciao, Michael.
Re: [PR tree-optimization/52558]: RFC: questions on store data race
Whoops, I forgot to commit that last patch. Check now. The warning is there on the 4.7 branch now. -- Eric Botcazou
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 11:59 AM, Duncan Sands baldr...@free.fr wrote:

Hi Richard,

Uh, I don't think we should do that. Why do we include cstring here anyways? Ian - you added this include in rev. 167764, I don't think that was proper. But I'm not sure wrapping a system.h include inside extern "C" from a C++ plugin is proper either ...

since the plugin needs to call GCC routines, and GCC is built as C, it has to wrap at least some GCC headers in extern "C" to avoid mangling of the names of those GCC routines (otherwise you can't load the plugin because the linker will look for the mangled names in GCC and not find them). But perhaps you know a trick to avoid the name mangling problem? It is true that maybe via a careful dance it is possible to not wrap system.h in extern "C" - I will give it a go.

As system.h is supposed to only include system headers and do nothing else it has to be prepared to be included from C++ already, so no extern "C" wrapping should be necessary for it.

Richard.

Ciao, Duncan.

Thanks,
Richard.

Thanks, Duncan.

Index: gcc/system.h
===================================================================
--- gcc/system.h (revision 188518)
+++ gcc/system.h (working copy)
@@ -191,7 +191,9 @@
 #endif
 
 #ifdef __cplusplus
+extern "C++" {
 # include <cstring>
+}
 #endif
 
 /* Some of glibc's string inlines cause warnings.  Plus we'd rather
Re: [PR tree-optimization/52558]: RFC: questions on store data race
On 06/15/12 06:40, Eric Botcazou wrote:

Whoops, I forgot to commit that last patch. Check now.

The warning is there on the 4.7 branch now.

Arghhh, that's the second time. I wonder why the warning doesn't show up on my bootstraps.

Anyway, committed the attached patch to branch.

  Backport from mainline:
  2012-05-31 Aldy Hernandez al...@redhat.com
  * tree-ssa-loop-im.c (execute_sm): Do not check flag_tm.
  * gimple.h (block_in_transaction): Check for flag_tm.

Index: tree-ssa-loop-im.c
===================================================================
--- tree-ssa-loop-im.c (revision 188631)
+++ tree-ssa-loop-im.c (working copy)
@@ -2154,7 +2154,7 @@ execute_sm (struct loop *loop, VEC (edge
   fmt_data.orig_loop = loop;
   for_each_index (&ref->mem, force_move_till, &fmt_data);
 
-  if ((flag_tm && block_in_transaction (loop_preheader_edge (loop)->src))
+  if (block_in_transaction (loop_preheader_edge (loop)->src)
       || !PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES))
     multi_threaded_model_p = true;
 
Index: gimple.h
===================================================================
--- gimple.h (revision 188631)
+++ gimple.h (working copy)
@@ -1587,7 +1587,7 @@ gimple_set_has_volatile_ops (gimple stmt
 static inline bool
 block_in_transaction (basic_block bb)
 {
-  return bb->flags & BB_IN_TRANSACTION;
+  return flag_tm && bb->flags & BB_IN_TRANSACTION;
 }
 
 /* Return true if STMT is in a transaction.  */
Re: RFA: better gimplification of compound literals
On Thu, Jun 14, 2012 at 5:33 PM, Michael Matz m...@suse.de wrote:

Hi,

On Thu, 14 Jun 2012, Michael Matz wrote:

In any case, this patch is currently in regstrapping on x86-64. Okay if it passes (modulo changes for the above symtab_get_node() issue)?

After discussion with Honza, consider the patch changed like so:

   if (!from_decl
       || TREE_CODE (from_decl) != VAR_DECL
       || !DECL_EXTERNAL (from_decl)
-      || (symtab_get_node (from_decl)->symbol.in_other_partition))
+      || (flag_ltrans
+          && symtab_get_node (from_decl)->symbol.in_other_partition))
     return true;

Restarted regstrapping the thing on x86_64 again. Okay if that passes?

Ok. But I wonder how the symtab cannot be ready when we gimplify - after all we gimplify only from after cgraph_finalize_compilation_unit ...

We build nodes for external declarations only after we see references to them, and we build references from gimplified bodies. So at this time the symtab has all defined symbols but only some external ones, as the symtab construction goes by.

Honza
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
Hi Richard,

As system.h is supposed to only include system headers and do nothing else it has to be prepared to be included from C++ already, so no extern "C" wrapping should be necessary for it.

it defines fancy_abort. Not wrapping system.h in extern "C" results in

  undefined symbol: _Z11fancy_abortPKciS0_

when loading the plugin.

Ciao, Duncan.
[PATCH][3/n] VRP and anti-range handling
This makes set_and_canonicalize_value_range more consistent. To be used in further patches. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2012-06-15 Richard Guenther rguent...@suse.de * tree-vrp.c (set_and_canonicalize_value_range): Use canonical predicates to set VR_UNDEFINED and VR_VARYING. Drop a case we assert for in set_value_range to VR_VARYING. Index: gcc/tree-vrp.c === *** gcc/tree-vrp.c.orig 2012-06-15 13:50:01.929912328 +0200 --- gcc/tree-vrp.c 2012-06-15 13:50:30.354911342 +0200 *** nonnull_arg_p (const_tree arg) *** 386,391 --- 386,403 } + /* Set value range VR to VR_UNDEFINED. */ + + static inline void + set_value_range_to_undefined (value_range_t *vr) + { + vr-type = VR_UNDEFINED; + vr-min = vr-max = NULL_TREE; + if (vr-equiv) + bitmap_clear (vr-equiv); + } + + /* Set value range VR to VR_VARYING. */ static inline void *** static void *** 463,472 set_and_canonicalize_value_range (value_range_t *vr, enum value_range_type t, tree min, tree max, bitmap equiv) { ! /* Nothing to canonicalize for symbolic or unknown or varying ranges. */ ! if ((t != VR_RANGE ! t != VR_ANTI_RANGE) ! || TREE_CODE (min) != INTEGER_CST || TREE_CODE (max) != INTEGER_CST) { set_value_range (vr, t, min, max, equiv); --- 475,494 set_and_canonicalize_value_range (value_range_t *vr, enum value_range_type t, tree min, tree max, bitmap equiv) { ! /* Use the canonical setters for VR_UNDEFINED and VR_VARYING. */ ! if (t == VR_UNDEFINED) ! { ! set_value_range_to_undefined (vr); ! return; ! } ! else if (t == VR_VARYING) ! { ! set_value_range_to_varying (vr); ! return; ! } ! ! /* Nothing to canonicalize for symbolic ranges. */ ! if (TREE_CODE (min) != INTEGER_CST || TREE_CODE (max) != INTEGER_CST) { set_value_range (vr, t, min, max, equiv); *** set_and_canonicalize_value_range (value_ *** 502,508 if (is_min is_max) { ! /* We cannot deal with empty ranges, drop to varying. */ set_value_range_to_varying (vr); return; } --- 524,531 if (is_min is_max) { ! 
/* We cannot deal with empty ranges, drop to varying. !??? This could be VR_UNDEFINED instead. */ set_value_range_to_varying (vr); return; } *** set_and_canonicalize_value_range (value_ *** 525,530 --- 548,562 } } + /* Drop [-INF(OVF), +INF(OVF)] to varying. */ + if (needs_overflow_infinity (TREE_TYPE (min)) +is_overflow_infinity (min) +is_overflow_infinity (max)) + { + set_value_range_to_varying (vr); + return; + } + set_value_range (vr, t, min, max, equiv); } *** set_value_range_to_truthvalue (value_ran *** 608,625 } - /* Set value range VR to VR_UNDEFINED. */ - - static inline void - set_value_range_to_undefined (value_range_t *vr) - { - vr-type = VR_UNDEFINED; - vr-min = vr-max = NULL_TREE; - if (vr-equiv) - bitmap_clear (vr-equiv); - } - - /* If abs (min) abs (max), set VR to [-max, max], if abs (min) = abs (max), set VR to [-min, min]. */ --- 640,645
[PATCH][4/n] VRP and anti-range handling
This tries to completely implement the intersect primitive for VRP (what extract_range_from_assert does at its end when merging new and old knowledge). Bootstrap and regtest pending on x86_64-unknown-linux-gnu. I plan to re-organize vrp_meet in a similar fashion as a followup. Richard. 2012-06-15 Richard Guenther rguent...@suse.de * tree-vrp.c (extract_range_from_assert): Split out range intersecting code. (intersect_ranges): New function. (vrp_intersect_ranges): Likewise. Index: trunk/gcc/tree-vrp.c === *** trunk.orig/gcc/tree-vrp.c 2012-06-15 14:12:44.0 +0200 --- trunk/gcc/tree-vrp.c2012-06-15 14:33:42.861821583 +0200 *** live_on_edge (edge e, tree name) *** 95,100 --- 95,101 static int compare_values (tree val1, tree val2); static int compare_values_warnv (tree val1, tree val2, bool *); static void vrp_meet (value_range_t *, value_range_t *); + static void vrp_intersect_ranges (value_range_t *, value_range_t *); static tree vrp_evaluate_conditional_warnv_with_ops (enum tree_code, tree, tree, bool, bool *, bool *); *** static void *** 1515,1521 extract_range_from_assert (value_range_t *vr_p, tree expr) { tree var, cond, limit, min, max, type; ! value_range_t *var_vr, *limit_vr; enum tree_code cond_code; var = ASSERT_EXPR_VAR (expr); --- 1516,1522 extract_range_from_assert (value_range_t *vr_p, tree expr) { tree var, cond, limit, min, max, type; ! value_range_t *limit_vr; enum tree_code cond_code; var = ASSERT_EXPR_VAR (expr); *** extract_range_from_assert (value_range_t *** 1777,2014 else gcc_unreachable (); ! /* If VAR already had a known range, it may happen that the new ! range we have computed and VAR's range are not compatible. For ! instance, ! ! if (p_5 == NULL) ! p_6 = ASSERT_EXPR p_5, p_5 == NULL; ! x_7 = p_6-fld; ! p_8 = ASSERT_EXPR p_6, p_6 != NULL; ! ! While the above comes from a faulty program, it will cause an ICE ! later because p_8 and p_6 will have incompatible ranges and at ! the same time will be considered equivalent. A similar situation ! 
would arise from ! ! if (i_5 10) ! i_6 = ASSERT_EXPR i_5, i_5 10; ! if (i_5 5) ! i_7 = ASSERT_EXPR i_6, i_6 5; ! ! Again i_6 and i_7 will have incompatible ranges. It would be ! pointless to try and do anything with i_7's range because ! anything dominated by 'if (i_5 5)' will be optimized away. ! Note, due to the wa in which simulation proceeds, the statement ! i_7 = ASSERT_EXPR ... we would never be visited because the ! conditional 'if (i_5 5)' always evaluates to false. However, ! this extra check does not hurt and may protect against future ! changes to VRP that may get into a situation similar to the ! NULL pointer dereference example. ! ! Note that these compatibility tests are only needed when dealing ! with ranges or a mix of range and anti-range. If VAR_VR and VR_P ! are both anti-ranges, they will always be compatible, because two ! anti-ranges will always have a non-empty intersection. */ ! ! var_vr = get_value_range (var); ! ! /* We may need to make adjustments when VR_P and VAR_VR are numeric ! ranges or anti-ranges. */ ! if (vr_p-type == VR_VARYING ! || vr_p-type == VR_UNDEFINED ! || var_vr-type == VR_VARYING ! || var_vr-type == VR_UNDEFINED ! || symbolic_range_p (vr_p) ! || symbolic_range_p (var_vr)) ! return; ! ! if (var_vr-type == VR_RANGE vr_p-type == VR_RANGE) ! { ! /* If the two ranges have a non-empty intersection, we can !refine the resulting range. Since the assert expression !creates an equivalency and at the same time it asserts a !predicate, we can take the intersection of the two ranges to !get better precision. */ ! if (value_ranges_intersect_p (var_vr, vr_p)) ! { ! /* Use the larger of the two minimums. */ ! if (compare_values (vr_p-min, var_vr-min) == -1) ! min = var_vr-min; ! else ! min = vr_p-min; ! ! /* Use the smaller of the two maximums. */ ! if (compare_values (vr_p-max, var_vr-max) == 1) ! max = var_vr-max; ! else ! max = vr_p-max; ! ! set_value_range (vr_p, vr_p-type, min, max, vr_p-equiv); ! } ! else ! { ! 
/* The two ranges do not intersect, set the new range to !VARYING, because we will not be able to do anything !meaningful with it. */ ! set_value_range_to_varying (vr_p); ! } ! } ! else if ((var_vr-type ==
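As a rough illustration of the intersection rule the removed comment describes (take the larger of the two minimums and the smaller of the two maximums, and drop to varying when nothing is left), here is a minimal standalone sketch. It is not the patch's intersect_ranges: struct simple_range and intersect_simple_ranges are hypothetical names, bounds are plain long values rather than trees, and anti-ranges, symbolic bounds and equivalence sets are left out entirely.

```c
/* Simplified stand-in for a VRP value range: either a closed interval
   [min, max] or "varying" (nothing known).  This is NOT GCC's
   value_range_t; it only models the plain-range case.  */
enum range_kind { RANGE, VARYING };

struct simple_range
{
  enum range_kind kind;
  long min;
  long max;
};

/* Intersect *A with *B and store the result in *A.  */
static void
intersect_simple_ranges (struct simple_range *a, const struct simple_range *b)
{
  long min, max;

  /* A varying operand carries no information: keep the other one.  */
  if (b->kind == VARYING)
    return;
  if (a->kind == VARYING)
    {
      *a = *b;
      return;
    }

  /* Use the larger of the two minimums and the smaller of the two
     maximums.  */
  min = a->min > b->min ? a->min : b->min;
  max = a->max < b->max ? a->max : b->max;

  if (min > max)
    {
      /* Empty intersection: drop to varying, since nothing meaningful
         can be done with an empty range in this model.  */
      a->kind = VARYING;
      a->min = a->max = 0;
      return;
    }

  a->min = min;
  a->max = max;
}
```

With this model, intersecting [0, 10] with [5, 20] gives [5, 10], while intersecting [0, 4] with [6, 9] drops to varying.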
[PATCH] Fix PR tree-optimization/53636 (SLP generates invalid misaligned access)
Hello,

PR tree-optimization/53636 is a crash due to an invalid unaligned access generated by the vectorizer.

The problem is that vect_compute_data_ref_alignment uses DR_ALIGNED_TO as computed by the default data-ref analysis to decide whether an access is sufficiently aligned for the vectorizer. However, this analysis computes the scalar evolution relative to the innermost loop in which the access takes place; DR_ALIGNED_TO only reflects the known alignment of the *base* address according to that evolution.

Now, if we're actually about to vectorize this particular loop, then just checking the alignment of the base is exactly what we need to do (subsequent accesses will usually be misaligned, but that's OK since we're transforming those into a vector access).

However, if we're actually currently vectorizing something else, this test is not sufficient. The code currently already checks for the case where we're performing outer-loop vectorization. In this case, we need to check alignment of the access on *every* pass through the inner loop, as the comment states:

  /* In case the dataref is in an inner-loop of the loop that is being
     vectorized (LOOP), we use the base and misalignment information
     relative to the outer-loop (LOOP). This is ok only if the misalignment
     stays the same throughout the execution of the inner-loop, which is why
     we have to check that the stride of the dataref in the inner-loop evenly
     divides by the vector size. */

However, there is a second case where we need to check every pass: if we're not actually vectorizing any loop, but are performing basic-block SLP. In this case, it would appear that we need the same check as described in the comment above, i.e. to verify that the stride is a multiple of the vector size.

The patch below adds this check, and this indeed fixes the invalid access I was seeing in the test case (in the final assembler, we now get a vld1.16 instead of vldr).

Tested on arm-linux-gnueabi with no regressions.

OK for mainline?
Bye,
Ulrich

ChangeLog:

gcc/
  PR tree-optimization/53636
  * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Verify stride when doing basic-block vectorization.

gcc/testsuite/
  PR tree-optimization/53636
  * gcc.target/arm/pr53636.c: New test.

=== added file 'gcc/testsuite/gcc.target/arm/pr53636.c'
--- gcc/testsuite/gcc.target/arm/pr53636.c 1970-01-01 00:00:00 +0000
+++ gcc/testsuite/gcc.target/arm/pr53636.c 2012-06-11 17:31:41 +0000
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O -ftree-vectorize" } */
+/* { dg-add-options arm_neon } */
+
+void fill (short *buf) __attribute__ ((noinline));
+void fill (short *buf)
+{
+  int i;
+
+  for (i = 0; i < 11 * 8; i++)
+    buf[i] = i;
+}
+
+void test (unsigned char *dst) __attribute__ ((noinline));
+void test (unsigned char *dst)
+{
+  short tmp[11 * 8], *tptr;
+  int i;
+
+  fill (tmp);
+
+  tptr = tmp;
+  for (i = 0; i < 8; i++)
+    {
+      dst[0] = (-tptr[0] + 9 * tptr[0 + 1] + 9 * tptr[0 + 2] - tptr[0 + 3]) >> 7;
+      dst[1] = (-tptr[1] + 9 * tptr[1 + 1] + 9 * tptr[1 + 2] - tptr[1 + 3]) >> 7;
+      dst[2] = (-tptr[2] + 9 * tptr[2 + 1] + 9 * tptr[2 + 2] - tptr[2 + 3]) >> 7;
+      dst[3] = (-tptr[3] + 9 * tptr[3 + 1] + 9 * tptr[3 + 2] - tptr[3 + 3]) >> 7;
+      dst[4] = (-tptr[4] + 9 * tptr[4 + 1] + 9 * tptr[4 + 2] - tptr[4 + 3]) >> 7;
+      dst[5] = (-tptr[5] + 9 * tptr[5 + 1] + 9 * tptr[5 + 2] - tptr[5 + 3]) >> 7;
+      dst[6] = (-tptr[6] + 9 * tptr[6 + 1] + 9 * tptr[6 + 2] - tptr[6 + 3]) >> 7;
+      dst[7] = (-tptr[7] + 9 * tptr[7 + 1] + 9 * tptr[7 + 2] - tptr[7 + 3]) >> 7;
+
+      dst += 8;
+      tptr += 11;
+    }
+}
+
+int main (void)
+{
+  char buf [8 * 8];
+
+  test (buf);
+
+  return 0;
+}
+

=== modified file 'gcc/tree-vect-data-refs.c'
--- gcc/tree-vect-data-refs.c 2012-05-31 08:46:10 +0000
+++ gcc/tree-vect-data-refs.c 2012-06-11 17:31:41 +0000
@@ -845,6 +845,24 @@
         }
     }
 
+  /* Similarly, if we're doing basic-block vectorization, we can only use
+     base and misalignment information relative to an innermost loop if the
+     misalignment stays the same throughout the execution of the loop.
+     As above, this is the case if the stride of the dataref evenly divides
+     by the vector size.  */
+  if (!loop)
+    {
+      tree step = DR_STEP (dr);
+      HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step);
+
+      if (dr_step % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0)
+        {
+          if (vect_print_dump_info (REPORT_ALIGNMENT))
+            fprintf (vect_dump, "SLP: step doesn't divide the vector-size.");
+          misalign = NULL_TREE;
+        }
+    }
+
   base = build_fold_indirect_ref (base_addr);
   alignment = ssize_int (TYPE_ALIGN (vectype)/BITS_PER_UNIT);
 

--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
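To make the stride argument concrete, the following standalone sketch (plain C, not vectorizer code) models an access that starts at some misalignment and advances by a fixed byte stride on each iteration. The helper names are hypothetical, and the 16-byte vector size used in the note below is an assumption for illustration only; the real check in the patch works on DR_STEP and GET_MODE_SIZE (TYPE_MODE (vectype)).

```c
#include <stdbool.h>

/* Hypothetical helper: true if an access that advances by STEP bytes per
   iteration keeps the same misalignment relative to a VECSIZE-byte
   boundary on every iteration.  STEP is assumed non-negative and
   VECSIZE positive.  */
static bool
misalignment_is_invariant (long step, long vecsize)
{
  return step % vecsize == 0;
}

/* Misalignment in bytes of the access in iteration ITER, given the
   misalignment BASE_MISALIGN of the first access.  All arguments are
   assumed non-negative, VECSIZE positive.  */
static long
misalignment_at (long base_misalign, long step, long vecsize, long iter)
{
  return (base_misalign + iter * step) % vecsize;
}
```

In the test case above, tptr advances by 11 shorts per iteration, i.e. 22 bytes with a 2-byte short. Against an assumed 16-byte vector, misalignment_at (0, 22, 16, iter) yields 0, 6, 12, 2 for the first four iterations, so knowing the alignment of the base address says nothing about later accesses; a stride of 32 bytes would keep the misalignment constant.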
Re: [PATCH] Fix PR tree-optimization/53636 (SLP generates invalid misaligned access)
On Fri, Jun 15, 2012 at 3:13 PM, Ulrich Weigand uweig...@de.ibm.com wrote:

Hello,

PR tree-optimization/53636 is a crash due to an invalid unaligned access generated by the vectorizer.

The problem is that vect_compute_data_ref_alignment uses DR_ALIGNED_TO as computed by the default data-ref analysis to decide whether an access is sufficiently aligned for the vectorizer. However, this analysis computes the scalar evolution relative to the innermost loop in which the access takes place; DR_ALIGNED_TO only reflects the known alignment of the *base* address according to that evolution.

Now, if we're actually about to vectorize this particular loop, then just checking the alignment of the base is exactly what we need to do (subsequent accesses will usually be misaligned, but that's OK since we're transforming those into a vector access).

However, if we're actually currently vectorizing something else, this test is not sufficient. The code currently already checks for the case where we're performing outer-loop vectorization. In this case, we need to check alignment of the access on *every* pass through the inner loop, as the comment states:

  /* In case the dataref is in an inner-loop of the loop that is being
     vectorized (LOOP), we use the base and misalignment information
     relative to the outer-loop (LOOP). This is ok only if the misalignment
     stays the same throughout the execution of the inner-loop, which is why
     we have to check that the stride of the dataref in the inner-loop evenly
     divides by the vector size. */

However, there is a second case where we need to check every pass: if we're not actually vectorizing any loop, but are performing basic-block SLP. In this case, it would appear that we need the same check as described in the comment above, i.e. to verify that the stride is a multiple of the vector size.
The patch below adds this check, and this indeed fixes the invalid access I was seeing in the test case (in the final assembler, we now get a vld1.16 instead of vldr). Tested on arm-linux-gnueabi with no regressions. OK for mainline? Ok. Thanks, Richard. Bye, Ulrich ChangeLog: gcc/ PR tree-optimization/53636 * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Verify stride when doing basic-block vectorization. gcc/testsuite/ PR tree-optimization/53636 * gcc.target/arm/pr53636.c: New test. === added file 'gcc/testsuite/gcc.target/arm/pr53636.c' --- gcc/testsuite/gcc.target/arm/pr53636.c 1970-01-01 00:00:00 + +++ gcc/testsuite/gcc.target/arm/pr53636.c 2012-06-11 17:31:41 + @@ -0,0 +1,48 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options -O -ftree-vectorize } */ +/* { dg-add-options arm_neon } */ + +void fill (short *buf) __attribute__ ((noinline)); +void fill (short *buf) +{ + int i; + + for (i = 0; i 11 * 8; i++) + buf[i] = i; +} + +void test (unsigned char *dst) __attribute__ ((noinline)); +void test (unsigned char *dst) +{ + short tmp[11 * 8], *tptr; + int i; + + fill (tmp); + + tptr = tmp; + for (i = 0; i 8; i++) + { + dst[0] = (-tptr[0] + 9 * tptr[0 + 1] + 9 * tptr[0 + 2] - tptr[0 + 3]) 7; + dst[1] = (-tptr[1] + 9 * tptr[1 + 1] + 9 * tptr[1 + 2] - tptr[1 + 3]) 7; + dst[2] = (-tptr[2] + 9 * tptr[2 + 1] + 9 * tptr[2 + 2] - tptr[2 + 3]) 7; + dst[3] = (-tptr[3] + 9 * tptr[3 + 1] + 9 * tptr[3 + 2] - tptr[3 + 3]) 7; + dst[4] = (-tptr[4] + 9 * tptr[4 + 1] + 9 * tptr[4 + 2] - tptr[4 + 3]) 7; + dst[5] = (-tptr[5] + 9 * tptr[5 + 1] + 9 * tptr[5 + 2] - tptr[5 + 3]) 7; + dst[6] = (-tptr[6] + 9 * tptr[6 + 1] + 9 * tptr[6 + 2] - tptr[6 + 3]) 7; + dst[7] = (-tptr[7] + 9 * tptr[7 + 1] + 9 * tptr[7 + 2] - tptr[7 + 3]) 7; + + dst += 8; + tptr += 11; + } +} + +int main (void) +{ + char buf [8 * 8]; + + test (buf); + + return 0; +} + === modified file 'gcc/tree-vect-data-refs.c' --- gcc/tree-vect-data-refs.c 2012-05-31 08:46:10 + +++ 
gcc/tree-vect-data-refs.c 2012-06-11 17:31:41 + @@ -845,6 +845,24 @@ } } + /* Similarly, if we're doing basic-block vectorization, we can only use + base and misalignment information relative to an innermost loop if the + misalignment stays the same throughout the execution of the loop. + As above, this is the case if the stride of the dataref evenly divides + by the vector size. */ + if (!loop) + { + tree step = DR_STEP (dr); + HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step); + + if (dr_step % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0) + { + if (vect_print_dump_info (REPORT_ALIGNMENT)) + fprintf (vect_dump, SLP: step doesn't divide the vector-size.); + misalign = NULL_TREE; + } + } + base = build_fold_indirect_ref
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 2:40 PM, Duncan Sands baldr...@free.fr wrote:

Hi Richard,

As system.h is supposed to only include system headers and do nothing else it has to be prepared to be included from C++ already, so no extern "C" wrapping should be necessary for it.

it defines fancy_abort. Not wrapping system.h in extern "C" results in

  undefined symbol: _Z11fancy_abortPKciS0_

when loading the plugin.

Hmm, it looks like it does not need to define that though. fancy_abort should have been prototyped somewhere else while the macros can continue to stay in system.h (though I fail to see why we'd have the asserts here). Of course we have two implementations of it, one in errors.c and one in diagnostics.c.

Richard.

Ciao, Duncan.
[Patch, libgcc] Fix build warnings in fixed-bit.c
Hi, When building for, say, mips-linux-gnu, the build of objects from fixed-bit.c produces a lot of set but not used warnings for min_high min_low. looking at the code, these appear to be genuine. Fixed as below. OK for trunk? Iain libgcc: * fixed-bit.c (SATFRACT): Adjust declarations and init for min_high, min_low. Index: libgcc/fixed-bit.c === --- libgcc/fixed-bit.c (revision 188657) +++ libgcc/fixed-bit.c (working copy) @@ -768,11 +768,12 @@ SATFRACT (FROM_FIXED_C_TYPE a) #if FROM_MODE_UNSIGNED == 0 BIG_SINT_C_TYPE high, low; BIG_SINT_C_TYPE max_high, max_low; +# if TO_MODE_UNSIGNED == 0 BIG_SINT_C_TYPE min_high, min_low; +# endif #else BIG_UINT_C_TYPE high, low; BIG_UINT_C_TYPE max_high, max_low; - BIG_UINT_C_TYPE min_high, min_low; #endif #if TO_FBITS FROM_FBITS BIG_UINT_C_TYPE utemp; @@ -819,13 +820,12 @@ SATFRACT (FROM_FIXED_C_TYPE a) #endif #if TO_MODE_UNSIGNED == 0 - min_high = -1; stemp = (BIG_SINT_C_TYPE)1 (BIG_WIDTH - 1); stemp = stemp (BIG_WIDTH - 1 - TO_I_F_BITS); +# if FROM_MODE_UNSIGNED == 0 + min_high = -1; min_low = stemp; -#else - min_high = 0; - min_low = 0; +# endif #endif #if FROM_MODE_UNSIGNED == 0 TO_MODE_UNSIGNED == 0 @@ -973,9 +973,9 @@ SATFRACT (FROM_INT_C_TYPE a) FROM_INT_C_TYPE x = a; BIG_SINT_C_TYPE high, low; BIG_SINT_C_TYPE max_high, max_low; - BIG_SINT_C_TYPE min_high, min_low; #if TO_MODE_UNSIGNED == 0 BIG_SINT_C_TYPE stemp; + BIG_SINT_C_TYPE min_low, min_high; #endif #if BIG_WIDTH != TO_FBITS BIG_UINT_C_TYPE utemp; @@ -1011,13 +1011,10 @@ SATFRACT (FROM_INT_C_TYPE a) #endif #if TO_MODE_UNSIGNED == 0 - min_high = -1; stemp = (BIG_SINT_C_TYPE)1 (BIG_WIDTH - 1); stemp = stemp (BIG_WIDTH - 1 - TO_I_F_BITS); min_low = stemp; -#else - min_high = 0; - min_low = 0; + min_high = -1; #endif #if TO_MODE_UNSIGNED == 0
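The general shape of the fix, declaring and initialising a variable under the same preprocessor condition as its only use so that no configuration ends up with a set-but-unused variable, can be sketched as follows. This is a toy saturating clamp, not the SATFRACT code: TO_UNSIGNED is a hypothetical stand-in for TO_MODE_UNSIGNED, and the 8-bit limits are made up for the example.

```c
/* Hypothetical configuration macro standing in for TO_MODE_UNSIGNED:
   0 selects a signed target range, anything else an unsigned one.  */
#ifndef TO_UNSIGNED
# define TO_UNSIGNED 0
#endif

/* Clamp X to [-128, 127] for a signed target or to [0, 255] for an
   unsigned one.  */
static int
saturate (int x)
{
  int max_val;
#if TO_UNSIGNED == 0
  int min_val;  /* Declared only where the signed path below reads it.  */
#endif

#if TO_UNSIGNED == 0
  max_val = 127;
  min_val = -128;
  if (x < min_val)
    return min_val;
#else
  max_val = 255;
  if (x < 0)
    return 0;
#endif

  if (x > max_val)
    return max_val;
  return x;
}
```

Had min_val been declared and assigned unconditionally, the unsigned configuration would assign it without ever reading it, which is the kind of situation -Wunused-but-set-variable reports.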
[arm] Remove obsolete FPA support (5/n): Clean up predicates and constraints
This patch cleans up the predicates and constraints that are now redundant after the removal of FPA and Maverick co-processor support.

Tested on arm-eabi.

	* arm.md (addsf3, adddf3): Use s_register_operand.
	(subsf3, subdf3): Likewise.
	(mulsf3, muldf3): Likewise.
	(divsf3, divdf3): Likewise.
	(movsfcc, movdfcc): Likewise.
	* predicates.md (f_register_operand): Delete.
	(arm_float_rhs_operand): Delete.
	(arm_float_add_operand): Delete.
	(arm_float_compare_operand): Use s_register_operand when there's no VFP.
	(cirrus_register_operand): Delete.
	(cirrus_fp_register): Delete.
	(cirrus_shift_const): Delete.
	(cmpdi_operand): Remove Maverick support.
	* constraints.md (f, v, H): Delete constraints.
	(G): Update documentation.
	* arm.c (fp_consts_inited): Convert to bool.
	(strings_fp): Delete.
	(values_fp): Delete.
	(value_fp0): New variable.
	(init_fp_table): Simplify logic.
	(arm_const_double_rtx): Likewise.
	(fp_immediate_constant): Likewise.
	(fp_const_from_val): Likewise.
	(neg_const_double_rtx_ok_for_fpa): Delete.
	* doc/md.texi (ARM constraints): Update documentation.

R.

Index: doc/md.texi
===================================================================
--- doc/md.texi (revision 188621)
+++ doc/md.texi (working copy)
@@ -1653,21 +1653,13 @@ table heading for each architecture is t
 the meanings of that architecture's constraints.
 
 @table @emph
-@item ARM family---@file{config/arm/arm.h}
+@item ARM family---@file{config/arm/constraints.md}
 @table @code
-@item f
-Floating-point register
-
 @item w
 VFP floating-point register
 
-@item F
-One of the floating-point constants 0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0
-or 10.0
-
 @item G
-Floating-point constant that would satisfy the constraint @samp{F} if it
-were negated
+The floating-point constant 0.0
 
 @item I
 Integer that is valid as an immediate operand in a data processing
Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c (revision 188622)
+++ config/arm/arm.c (working copy)
@@ -8676,33 +8676,18 @@ arm_cortex_a5_branch_cost (bool speed_p,
   return speed_p ? 0 : arm_default_branch_cost (speed_p, predictable_p);
 }
 
-static int fp_consts_inited = 0;
+static bool fp_consts_inited = false;
 
-/* Only zero is valid for VFP.  Other values are also valid for FPA.  */
-static const char * const strings_fp[8] =
-{
-  "0",   "1",   "2",   "3",
-  "4",   "5",   "0.5", "10"
-};
-
-static REAL_VALUE_TYPE values_fp[8];
+static REAL_VALUE_TYPE value_fp0;
 
 static void
 init_fp_table (void)
 {
-  int i;
   REAL_VALUE_TYPE r;
 
-  if (TARGET_VFP)
-    fp_consts_inited = 1;
-  else
-    fp_consts_inited = 8;
-
-  for (i = 0; i < fp_consts_inited; i++)
-    {
-      r = REAL_VALUE_ATOF (strings_fp[i], DFmode);
-      values_fp[i] = r;
-    }
+  r = REAL_VALUE_ATOF ("0", DFmode);
+  value_fp0 = r;
+  fp_consts_inited = true;
 }
 
 /* Return TRUE if rtx X is a valid immediate FP constant.  */
@@ -8719,36 +8704,12 @@ arm_const_double_rtx (rtx x)
   if (REAL_VALUE_MINUS_ZERO (r))
     return 0;
 
-  for (i = 0; i < fp_consts_inited; i++)
-    if (REAL_VALUES_EQUAL (r, values_fp[i]))
-      return 1;
-
-  return 0;
-}
-
-/* Return TRUE if rtx X is a valid immediate FPA constant.  */
-int
-neg_const_double_rtx_ok_for_fpa (rtx x)
-{
-  REAL_VALUE_TYPE r;
-  int i;
-
-  if (!fp_consts_inited)
-    init_fp_table ();
-
-  REAL_VALUE_FROM_CONST_DOUBLE (r, x);
-  r = real_value_negate (&r);
-  if (REAL_VALUE_MINUS_ZERO (r))
-    return 0;
-
-  for (i = 0; i < 8; i++)
-    if (REAL_VALUES_EQUAL (r, values_fp[i]))
-      return 1;
+  if (REAL_VALUES_EQUAL (r, value_fp0))
+    return 1;
 
   return 0;
 }
 
-
 /* VFPv3 has a fairly wide range of representable immediates, formed from
    quarter-precision floating-point values. These can be evaluated using this
    formula (with ^ for exponentiation):
@@ -13715,11 +13676,9 @@ fp_immediate_constant (rtx x)
     init_fp_table ();
 
   REAL_VALUE_FROM_CONST_DOUBLE (r, x);
-  for (i = 0; i < 8; i++)
-    if (REAL_VALUES_EQUAL (r, values_fp[i]))
-      return strings_fp[i];
 
-  gcc_unreachable ();
+  gcc_assert (REAL_VALUES_EQUAL (r, value_fp0));
+  return "0";
 }
 
 /* As for fp_immediate_constant, but value is passed directly, not in rtx.  */
@@ -13731,11 +13690,8 @@ fp_const_from_val (REAL_VALUE_TYPE *r)
   if (!fp_consts_inited)
     init_fp_table ();
 
-  for (i = 0; i < 8; i++)
-    if (REAL_VALUES_EQUAL (*r, values_fp[i]))
-      return strings_fp[i];
-
-  gcc_unreachable ();
+  gcc_assert (REAL_VALUES_EQUAL (*r, value_fp0));
+  return "0";
 }
 
 /* Output the operands of a LDM/STM instruction to STREAM.
Index: config/arm/constraints.md
===================================================================
--- config/arm/constraints.md (revision 188621)
+++ config/arm/constraints.md (working copy)
@@
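To make the effect of the arm.c change concrete: once the FPA table is gone, the only immediate that passes the check is positive 0.0, and negative zero is still rejected first. The following is a minimal stand-alone sketch of that predicate, not the GCC code itself. It uses plain double instead of REAL_VALUE_TYPE, and the name is_valid_fp_immediate is hypothetical.

```cpp
#include <cmath>

// Hypothetical stand-in for the post-patch logic of arm_const_double_rtx:
// accept only +0.0; reject -0.0 and every other value.
bool is_valid_fp_immediate(double value)
{
    // Mirrors the REAL_VALUE_MINUS_ZERO early exit in the patch.
    if (value == 0.0 && std::signbit(value))
        return false;

    // Mirrors the single comparison against value_fp0.
    return value == 0.0;
}
```

With the old table, values such as 0.5 or 10.0 would also have been accepted when VFP was not in use; in this sketch they are simply rejected.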
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 4:48 AM, Richard Guenther richard.guent...@gmail.com wrote:
> On Fri, Jun 15, 2012 at 11:27 AM, Duncan Sands baldr...@free.fr wrote:
>> My plugin is written in C++. When including headers from gcc-4.6 it wraps them in 'extern "C"' to prevent name mangling. Some of the plugin headers include gcc/system.h which includes the C++ header <cstring> if it detects the use of a C++ compiler. As a result <cstring> routines included this way end up wrapped in 'extern "C"', while those included directly from C++ aren't 'extern "C"'. This doesn't worry g++, but clang gets upset, erroring out with a complaint about multiple inconsistent declarations of memchr and friends.
>>
>> Is the following patch OK to apply to gcc-4.6? And is it in principle OK to apply to gcc-4.7 (I didn't test it there yet)? It would be useful if gcc-4.7 is compiled as C.
>
> Uh, I don't think we should do that. Why do we include <cstring> here anyways?

Agreed. Including a standard header within a language specification is bogus. If <string.h> is needed, just include it as such. A standard header isn't under our control (whether C or C++ is used is immaterial); the whole justification of "don't want name mangling" is bogus. If a standard header isn't needed, it should not be included.

-- Gaby
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 4:59 AM, Duncan Sands baldr...@free.fr wrote:
> Hi Richard,
>
>> Uh, I don't think we should do that. Why do we include <cstring> here anyways? Ian - you added this include in rev. 167764, I don't think that was proper. But I'm not sure wrapping a system.h include inside extern "C" from a C++ plugin is proper either ...
>
> since the plugin needs to call GCC routines, and GCC is built as C, it has to wrap at least some GCC headers in extern "C" to avoid mangling of the names of those GCC routines (otherwise you can't load the plugin because the linker will look for the mangled names in GCC and not find them). But perhaps you know a trick to avoid the name mangling problem? It is true that maybe via a careful dance it is possible to not wrap system.h in extern "C" - I will give it a go.

Including the whole gcc/system.h in an extern "C" is not good. But you can put the declarations in gcc/system.h that you want to have C language spec in such a block.
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 7:40 AM, Duncan Sands baldr...@free.fr wrote:
> Hi Richard,
>
>> As system.h is supposed to only include system headers and do nothing else it has to be prepared to be included from C++ already, so no extern "C" wrapping should be necessary for it.
>
> it defines fancy_abort. Not wrapping system.h in extern "C" results in
>   undefined symbol: _Z11fancy_abortPKciS0_
> when loading the plugin.

If you want fancy_abort to have a C language specification, that is what you should declare as such.

> Ciao, Duncan.
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 8:25 AM, Richard Guenther richard.guent...@gmail.com wrote: On Fri, Jun 15, 2012 at 2:40 PM, Duncan Sands baldr...@free.fr wrote: Hi Richard, As system.h is supposed to only include system headers and do nothing else it has to be prepared to be included from C++ already, so no extern "C" wrapping should be necessary for it. it defines fancy_abort. Not wrapping system.h in extern "C" results in undefined symbol: _Z11fancy_abortPKciS0_ when loading the plugin. Hmm, it looks like it does not need to define that though. fancy_abort should have been prototyped somewhere else while the macros can continue to stay in system.h (though I fail to see why we'd have the asserts here). Of course we have two implementations of it, one in errors.c and one in diagnostics.c. The one in diagnostics.c is what we generally use in almost all front-ends. The one in errors.c is used for just the simple-minded gcc driver, therefore it has to be simple. Maybe the fix is to move that declaration elsewhere?
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 9:18 AM, Duncan Sands baldr...@free.fr wrote:
> Hi Gabriel,
>
>>> it defines fancy_abort. Not wrapping system.h in extern "C" results in undefined symbol: _Z11fancy_abortPKciS0_ when loading the plugin.
>>
>> If you want fancy_abort to have a C language specification, that is what you should declare as such.
>
> my code isn't using fancy_abort directly, it is including GCC headers that use gcc_assert (which turns into fancy_abort).
>
> Ciao, Duncan.

Richard just reminded me that we have two fancy_aborts. Could you tell which one your code is indirectly using?
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
Hi Gabriel,

> Richard just reminded me that we have two fancy_aborts. Could you tell which one your code is indirectly using?

the one installed as plugin/include/system.h, which seems to be gcc/include/system.h. It is used for example in tree.h here:

  /* Advance to the next argument.  */
  static inline void
  function_args_iter_next (function_args_iterator *i)
  {
    gcc_assert (i->next != NULL_TREE);
    i->next = TREE_CHAIN (i->next);
  }

Best wishes, Duncan.
Re: RF[CA]: Don't restrict stack slot sharing
Hi, On Wed, 6 Jun 2012, Richard Guenther wrote: Regstrapped this patch (all languages+Ada) on x86_64-linux, with and without the above scheduler hacks, no regressions (without the scheduler hacks). The n_temp_slots_in_use part is ok. The rest is also a good idea, and indeed the middle-end type-based memory-model makes sharing slots always possible (modulo bugs, like the nonoverlapping_component_refs_p code - which should simply be removed). Thus, ok for the rest, too, after waiting a while for others to chime in. Now finally in as r188667. Ciao, Michael.
Re: [PATCH] Fix PR tree-optimization/53636 (SLP generates invalid misaligned access)
Richard Guenther wrote: On Fri, Jun 15, 2012 at 3:13 PM, Ulrich Weigand uweig...@de.ibm.com wrote: However, there is a second case where we need to check every pass: if we're not actually vectorizing any loop, but are performing basic-block SLP. In this case, it would appear that we need the same check as described in the comment above, i.e. to verify that the stride is a multiple of the vector size. The patch below adds this check, and this indeed fixes the invalid access I was seeing in the test case (in the final assembler, we now get a vld1.16 instead of vldr). Tested on arm-linux-gnueabi with no regressions. OK for mainline? Ok. Thanks for the quick review; I've checked this in to mainline now. I just noticed that the test case also crashes on 4.7, but not on 4.6. Would a backport to 4.7 also be OK, once testing passes? Thanks, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 9:33 AM, Duncan Sands baldr...@free.fr wrote:
> Hi Gabriel,
>
>> Richard just reminded me that we have two fancy_aborts. Could you tell which one your code is indirectly using?
>
> the one installed as plugin/include/system.h, which seems to be gcc/include/system.h.

OK. I think that declaration has to have the C language spec. Would you prepare a patch for that?

> It is used for example in tree.h here:
>
>   /* Advance to the next argument.  */
>   static inline void
>   function_args_iter_next (function_args_iterator *i)
>   {
>     gcc_assert (i->next != NULL_TREE);
>     i->next = TREE_CHAIN (i->next);
>   }
>
> Best wishes, Duncan.
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
Hi Gabriel,

>>> Richard just reminded me that we have two fancy_aborts. Could you tell which one your code is indirectly using?
>>
>> the one installed as plugin/include/system.h, which seems to be gcc/include/system.h.
>
> OK. I think that declaration has to have the C language spec. Would you prepare a patch for that?

you mean: wrap the fancy_abort declaration in system.h in 'extern "C"'? Sure, I will prepare a patch.

Best wishes, Duncan.

>> It is used for example in tree.h here:
>>
>>   /* Advance to the next argument.  */
>>   static inline void
>>   function_args_iter_next (function_args_iterator *i)
>>   {
>>     gcc_assert (i->next != NULL_TREE);
>>     i->next = TREE_CHAIN (i->next);
>>   }
>>
>> Best wishes, Duncan.
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
On Fri, Jun 15, 2012 at 10:13 AM, Duncan Sands baldr...@free.fr wrote:
> Hi Gabriel,
>
>>>> Richard just reminded me that we have two fancy_aborts. Could you tell which one your code is indirectly using?
>>>
>>> the one installed as plugin/include/system.h, which seems to be gcc/include/system.h.
>>
>> OK. I think that declaration has to have the C language spec. Would you prepare a patch for that?
>
> you mean: wrap the fancy_abort declaration in system.h in 'extern "C"'?

Yes. Thanks.

> Sure, I will prepare a patch.
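For readers following the thread, here is a small sketch of the approach agreed on above: give C linkage to the one declaration that must match a C-compiled definition, and keep standard headers outside any linkage block. The function demo_fancy_abort_location is a hypothetical stand-in with the same parameter list that the mangled name _Z11fancy_abortPKciS0_ encodes (const char *, int, const char *); it is not GCC's fancy_abort and nothing here is taken from system.h.

```cpp
#include <cstring>   // standard header, included outside any extern "C" block

// Only this declaration gets C linkage, so the linker looks for the plain
// symbol "demo_fancy_abort_location" rather than a mangled C++ name.
// Hypothetical function, used purely for illustration.
extern "C" int demo_fancy_abort_location(const char *file, int line,
                                         const char *function);

// Ordinary C++ helper; its symbol is mangled as usual.
static int text_length(const char *text)
{
    return static_cast<int>(std::strlen(text));
}

// The definition repeats the linkage specification. In the plugin case the
// definition would live in the C-compiled host program instead.
extern "C" int demo_fancy_abort_location(const char *file, int line,
                                         const char *function)
{
    return text_length(file) + line + text_length(function);
}
```

Because <cstring> is included at file scope, its declarations of memchr and friends keep whatever linkage the library gives them, which is the inconsistency clang complained about when the whole header was wrapped.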
[PATCH][Cilkplus] Fix bug for the -lto flag
Hello Everyone,

This patch is for the Cilkplus branch affecting both C and C++ compilers. This patch will fix the ICE when the user uses the -flto flag.

Thanks,
Balaji V. Iyer.

Index: gcc/tree.h
===================================================================
--- gcc/tree.h (revision 188669)
+++ gcc/tree.h (working copy)
@@ -342,6 +342,7 @@
   NOT_BUILT_IN = 0,
   BUILT_IN_FRONTEND,
   BUILT_IN_MD,
+  BUILT_IN_CILK,
   BUILT_IN_NORMAL
 };
 
@@ -3680,7 +3681,7 @@
      ??? The bitfield needs to be able to hold all target function codes
      as well.  */
   ENUM_BITFIELD(built_in_function) function_code : 11;
-  ENUM_BITFIELD(built_in_class) built_in_class : 2;
+  ENUM_BITFIELD(built_in_class) built_in_class : 3;
 
   unsigned static_ctor_flag : 1;
   unsigned static_dtor_flag : 1;
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def (revision 188669)
+++ gcc/builtins.def (working copy)
@@ -149,6 +149,11 @@
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE, \
                true, true, true, ATTRS, false, flag_tm)
 
+#undef DEF_CILK_BUILTIN_STUB
+#define DEF_CILK_BUILTIN_STUB(ENUM, NAME) \
+  DEF_BUILTIN (ENUM, NAME, BUILT_IN_CILK, BT_LAST, BT_LAST, false, false, \
+               false, ATTR_LAST, false, false)
+
 /* Define an attribute list for math functions that are normally
    "impure" because some of them may write into global memory for
    `errno'.  If !flag_errno_math they are instead "const".  */
@@ -802,38 +807,39 @@
 DEF_BUILTIN_STUB (BUILT_IN_EH_COPY_VALUES, "__builtin_eh_copy_values")
 
 /* Cilk */
-DEF_BUILTIN_STUB(BUILT_IN_CILK_WORKER_ID, "__cilkrts_current_worker_id")
-DEF_BUILTIN_STUB(BUILT_IN_CILK_WORKER_PTR, "__cilkrts_current_worker")
-DEF_BUILTIN_STUB(BUILT_IN_CILK_DETACH, "__cilkrts_detach")
-DEF_BUILTIN_STUB(BUILT_IN_CILK_SYNCHED, "__cilkrts_synched")
-DEF_BUILTIN_STUB(BUILT_IN_CILK_STOLEN, "__cilkrts_was_stolen")
-DEF_BUILTIN_STUB(BUILT_IN_CILK_ENTER_FRAME, "__cilkrts_enter_frame")
-DEF_BUILTIN_STUB(BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_ENTER_BEGIN, "__cilk_enter_begin")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_ENTER_H_BEGIN, "__cilk_enter_helper_begin")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_ENTER_END, "__cilk_enter_end")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_SPAWN_PREPARE, "__cilk_spawn_prepare")
-DEF_BUILTIN_STUB (BUILT_IN_SPAWN_OR_CONT, "__cilk_spawn_or_continue")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_DETACH_BEGIN, "__cilk_detach_begin")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_DETACH_END, "__cilk_detach_end")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_SYNC_BEGIN, "__cilk_sync_begin")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_SYNC_END, "__cilk_sync_end")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_BEGIN, "__cilk_leave_begin")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_END, "__cilk_leave_end")
-DEF_BUILTIN_STUB (BUILT_IN_CILKSCREEN_METACALL, "cilkscreen_metacall")
-DEF_BUILTIN_STUB (BUILT_IN_CILK_RESUME, "cilk_resume")
-DEF_BUILTIN_STUB (BUILT_IN_LEAVE_STOLEN, "cilk_leave_stolen")
-DEF_BUILTIN_STUB (BUILT_IN_SYNC_ABANDON, "cilk_sync_abandon")
-DEF_BUILTIN_STUB (BUILT_IN_CILKSCREEN_EN_INSTR,
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_WORKER_ID, "__cilkrts_current_worker_id")
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_WORKER_PTR, "__cilkrts_current_worker")
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_DETACH, "__cilkrts_detach")
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_SYNCHED, "__cilkrts_synched")
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_STOLEN, "__cilkrts_was_stolen")
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_ENTER_FRAME, "__cilkrts_enter_frame")
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
+DEF_CILK_BUILTIN_STUB(BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_ENTER_BEGIN, "__cilk_enter_begin")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_ENTER_H_BEGIN, "__cilk_enter_helper_begin")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_ENTER_END, "__cilk_enter_end")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SPAWN_PREPARE, "__cilk_spawn_prepare")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_SPAWN_OR_CONT, "__cilk_spawn_or_continue")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_DETACH_BEGIN, "__cilk_detach_begin")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_DETACH_END, "__cilk_detach_end")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC_BEGIN, "__cilk_sync_begin")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC_END, "__cilk_sync_end")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_BEGIN, "__cilk_leave_begin")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_END, "__cilk_leave_end")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILKSCREEN_METACALL, "cilkscreen_metacall")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_RESUME, "cilk_resume")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_LEAVE_STOLEN, "cilk_leave_stolen")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_SYNC_ABANDON, "cilk_sync_abandon")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILKSCREEN_EN_INSTR,
                   "cilkscreen_enable_instrumentation")
-DEF_BUILTIN_STUB (BUILT_IN_CILKSCREEN_DS_INSTR,
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILKSCREEN_DS_INSTR,
                   "cilkscreen_disable_instrumentation")
-DEF_BUILTIN_STUB
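A note on why the tree.h hunk widens the built_in_class bit-field from 2 to 3 bits: adding BUILT_IN_CILK pushes BUILT_IN_NORMAL from value 3 to value 4, and a 2-bit unsigned field can only hold 0 through 3. The sketch below checks that arithmetic at compile time. The enumeration is an illustrative copy with DEMO_ prefixes and the helper bits_needed is hypothetical; neither comes from the GCC sources.

```cpp
// Smallest bit-field width able to store every value in 0..max_value.
// Written recursively so it is a valid constexpr function in C++11.
constexpr unsigned bits_needed(unsigned max_value)
{
    return max_value < 2 ? 1 : 1 + bits_needed(max_value >> 1);
}

// Illustrative copy of the enumeration as it looks after the patch.
enum demo_built_in_class {
    DEMO_NOT_BUILT_IN = 0,
    DEMO_BUILT_IN_FRONTEND,
    DEMO_BUILT_IN_MD,
    DEMO_BUILT_IN_CILK,
    DEMO_BUILT_IN_NORMAL
};

// Before the patch the largest enumerator was 3, which fits in 2 bits.
static_assert(bits_needed(3) == 2, "four enumerators fit in 2 bits");

// After the patch the largest enumerator is 4, which needs 3 bits.
static_assert(bits_needed(DEMO_BUILT_IN_NORMAL) == 3,
              "five enumerators need 3 bits");
```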
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'
Richard Guenther richard.guent...@gmail.com writes:

> Ian - you added this include in rev. 167764, I don't think that was proper. But I'm not sure wrapping a system.h include inside extern "C" from a C++ plugin is proper either ...

I did commit 167764 but I didn't write it. It's from

http://gcc.gnu.org/ml/gcc-patches/2010-11/msg02567.html
http://gcc.gnu.org/PR46650

The patch is there because system.h poisons strerror. Clearly we have to #include <string.h> before poisoning strerror. And we do. But when we #include C++ headers, some of the C++ headers #include <cstring>. So system.h needs to do that also.

I think there is no question that as long as system.h poisons strerror, we need to arrange to #include both <string.h> and <cstring> before that poisoning, and that the natural way to ensure that is to #include both in system.h. And that is what we do today.

I don't really know what the right solution is here, because I don't know how we feel about wrapping #include "system.h" in extern "C". A simple workaround is to #include <cstring> before the #include "system.h". Or the OP's patch using extern "C++" is a simple workaround within system.h. Or maybe we simply drop the poison of strerror, and then system.h doesn't need to #include <cstring> anyhow.

Ian
Re: [RFC C++] Turn on builtin_shuffle for C++.
On 15 June 2012 01:44, Jason Merrill ja...@redhat.com wrote: OK. Thanks, now committed with the only change being that the PR number is now referenced in the Changelog. Ramana
Re: [patch] Fix PR middle-end/53590
> Btw, I think we should enable this flag by default for all languages but Java so that if you enable -fnon-call-exceptions for C or C++ you don't get too many spurious exceptions from dead code.

The attached patch enables it for the C family of languages (I'm not too sure about the other languages). It also adds the missing bits related to inlining (with the annoying FIXME for LTO in can_inline_edge_p).

Bootstrapped/regtested on x86_64-suse-linux, OK for mainline?

2012-06-15  Eric Botcazou  ebotca...@adacore.com

	PR middle-end/53590
	* doc/invoke.texi (-fdelete-dead-exceptions): Update.
	* cif-code.def (DEAD_EXCEPTIONS): New code.
	* ipa-inline.c (can_inline_edge_p): Return false if the caller can delete dead exceptions but the callee cannot.
	* tree-inline.c (initialize_cfun): Copy can_delete_dead_exceptions.
c-family/
	* c-opts.c (c_common_init_options_struct): Set opts->x_flag_delete_dead_exceptions to 1.

-- Eric Botcazou

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 188667)
+++ doc/invoke.texi (working copy)
@@ -19322,7 +19322,9 @@ arbitrary signal handlers such as @code{
 Consider that instructions that may throw exceptions but don't otherwise
 contribute to the execution of the program can be optimized away.
 This option is enabled by default for the Ada front end, as permitted by
-the Ada language specification.
+the Ada language specification.  It is also enabled for the front ends of
+the C family of languages, as it applies only to non-call exceptions and
+those are not part of the language specifications.
 Optimization passes that cause dead exceptions to be removed are enabled independently at different optimization levels.
 
 @item -funwind-tables
Index: c-family/c-opts.c
===================================================================
--- c-family/c-opts.c (revision 188445)
+++ c-family/c-opts.c (working copy)
@@ -204,6 +204,11 @@ c_common_init_options_struct (struct gcc
 
   /* By default, C99-like requirements for complex multiply and divide.  */
   opts->x_flag_complex_method = 2;
+
+  /* We can delete dead instructions that may throw exceptions because, in
+     practice, this occurs only for non-call exceptions and those are not
+     part of the language specifications.  */
+  opts->x_flag_delete_dead_exceptions = 1;
 }
 
 /* Common initialization before calling option handlers.  */
Index: cif-code.def
===================================================================
--- cif-code.def (revision 188445)
+++ cif-code.def (working copy)
@@ -96,7 +96,11 @@ DEFCIFCODE(EH_PERSONALITY, N_("exception
 
 /* We can't inline if the callee can throw non-call exceptions but the
    caller cannot.  */
-DEFCIFCODE(NON_CALL_EXCEPTIONS, N_("non-call exception handling mismatch"))
+DEFCIFCODE(NON_CALL_EXCEPTIONS, N_("non-call exceptions handling mismatch"))
+
+/* We can't inline if the caller can delete dead exceptions but the
+   callee cannot.  */
+DEFCIFCODE(DEAD_EXCEPTIONS, N_("dead exceptions handling mismatch"))
 
 /* We can't inline because of mismatched target specific options.  */
 DEFCIFCODE(TARGET_OPTION_MISMATCH, N_("target specific option mismatch"))
Index: ipa-inline.c
===================================================================
--- ipa-inline.c (revision 188667)
+++ ipa-inline.c (working copy)
@@ -302,6 +302,16 @@ can_inline_edge_p (struct cgraph_edge *e
       e->inline_failed = CIF_NON_CALL_EXCEPTIONS;
       inlinable = false;
     }
+  /* Don't inline if the caller can delete dead exceptions but the
+     callee cannot.
+     FIXME: this is obviously wrong for LTO where STRUCT_FUNCTION is missing.
+     Move the flag into cgraph node or mirror it in the inline summary.  */
+  else if (caller_cfun && caller_cfun->can_delete_dead_exceptions
+           && !(callee_cfun && callee_cfun->can_delete_dead_exceptions))
+    {
+      e->inline_failed = CIF_DEAD_EXCEPTIONS;
+      inlinable = false;
+    }
   /* Check compatibility of target optimization options.  */
   else if (!targetm.target_option.can_inline_p (e->caller->symbol.decl,
                                                 callee->symbol.decl))
Index: tree-inline.c
===================================================================
--- tree-inline.c (revision 188668)
+++ tree-inline.c (working copy)
@@ -2107,6 +2107,7 @@ initialize_cfun (tree new_fndecl, tree c
   cfun->after_inlining = src_cfun->after_inlining;
   cfun->can_throw_non_call_exceptions
     = src_cfun->can_throw_non_call_exceptions;
+  cfun->can_delete_dead_exceptions = src_cfun->can_delete_dead_exceptions;
   cfun->returns_struct = src_cfun->returns_struct;
   cfun->returns_pcc_struct = src_cfun->returns_pcc_struct;
   cfun->after_tree_profile = src_cfun->after_tree_profile;
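The two inlining rules involved in the ipa-inline.c hunk point in opposite directions, which is easy to misread in the diff. The sketch below restates them over a simplified, hypothetical FunctionFlags record. It assumes both functions have their flags available and so leaves out the null checks on caller_cfun and callee_cfun that the real code performs; the names InlineVerdict and check_exception_flags are invented for the illustration.

```cpp
// Hypothetical, simplified view of the two per-function flags involved.
struct FunctionFlags {
    bool can_throw_non_call_exceptions;
    bool can_delete_dead_exceptions;
};

enum class InlineVerdict {
    Ok,
    NonCallExceptionsMismatch,  // corresponds to CIF_NON_CALL_EXCEPTIONS
    DeadExceptionsMismatch      // corresponds to CIF_DEAD_EXCEPTIONS
};

InlineVerdict check_exception_flags(const FunctionFlags &caller,
                                    const FunctionFlags &callee)
{
    // Existing rule: the callee may throw non-call exceptions but the
    // caller may not.
    if (callee.can_throw_non_call_exceptions
        && !caller.can_throw_non_call_exceptions)
        return InlineVerdict::NonCallExceptionsMismatch;

    // Rule added by the patch: the caller may delete dead exceptions but
    // the callee may not, so inlining could remove an exception the callee
    // was compiled to preserve.
    if (caller.can_delete_dead_exceptions
        && !callee.can_delete_dead_exceptions)
        return InlineVerdict::DeadExceptionsMismatch;

    return InlineVerdict::Ok;
}
```

Inlining a callee that allows deletion into a caller that does not is still accepted by this check, since the stricter caller setting then governs the inlined body.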
Re: [RFC C++] Turn on builtin_shuffle for C++.
On Thu, 14 Jun 2012, Ramana Radhakrishnan wrote:

> While experimenting with the fixes to allow neon intrinsics to work with __builtin_shuffle I hit the fact that __builtin_shuffle isn't really supported by the C++ frontend. I'm keen we use __builtin_shuffle for these intrinsics, but that means we need this support in the C++ frontend. I've taken the liberty of pulling Marc's patch from bugzilla, adding the couple of bits and pieces that were needed, moved all the vshuf* tests from gcc.c-torture/execute to c-c++-common/torture which means they run for both the C and C++ compilers, and bootstrapped and regtested this on x86_64, gcc110 (powerpc*-linux) and arm-linux-gnueabi (with a cross compiler). I've then verified that all the tests pass and there are no regressions for these targets.
>
> Any other place I should be moving these tests to? Ok?
>
> regards, Ramana
>
> 2012-06-14  Marc Glisse  marc.gli...@inria.fr
>
> 	* c-typeck.c (c_build_vec_perm_expr): Move to c-family/c-common.c.
> 	* c-tree.h (c_build_vec_perm_expr): Move to c-family/c-common.h.
> c-family/
> 	* c-typeck.c (c_build_vec_perm_expr): Move to c-family/c-common.c.
> 	* c-tree.h (c_build_vec_perm_expr): Move to c-family/c-common.h.
> cp/
> 	* semantics.c (literal_type_p): Handle VECTOR_TYPE.
> 	(potential_constant_expression_1): Handle VEC_PERM_EXPR.

I just noticed this part. Rereading my comment in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033#c22 it seems like this may break things with -std=c++11 and you used the old version from comment 19. I am unable to test anything these days (taking a plane tomorrow), sorry.

Thanks again for taking charge of the patch,

-- Marc Glisse
Re: [RFC C++] Turn on builtin_shuffle for C++.
I just noticed this part. Rereading my comment in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033#c22 I haven't been able to make it break with -std=c++11 . Is there something I'm missing here ? it seems like this may break things with -std=c++11 and you used the old version from comment 19. I am unable to test anything these days (taking a plane tomorrow), sorry. ./g++ -B`pwd` /home/ramrad01/cross-build/fsf/src/gcc-rewrite-permute-intrinsics/gcc/testsuite/c-c++-common/torture/vshuf-v8si.c -std=c++11 -S -O2 appears to work just fine. Am I missing something here ? A number of tests that pass vector constants to __builtin_shuffle appear to be working ok . Can you point out what testcase and how it is broken with std=c++11. I remember trying some simple tests with that and that appeared to work. One thing I do notice now is that all the c-c++-common tests are possibly not running with -std=c++11 . However they appear to be compiling ok ? My motivation in this stems from being able to rewrite the Neon permute intrinsics in C and C++ with __builtin_shuffle. Ramana Thanks again for taking charge of the patch, -- Marc Glisse
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
On Jun 15, 2012, at 2:22 AM, Pedro Alves pal...@redhat.com wrote: It's not about example, but the fact that host compilers have been compiling that code as part of building gcc for years, without anyone complaining Yeah, I think we should just jump to c++ 11 and not look back... Fighting against using a 10 year old language standard I think is silly; and I like have the old obsolete ports in gcc.
Re: [RFC C++] Turn on builtin_shuffle for C++.
On Fri, 15 Jun 2012, Ramana Radhakrishnan wrote: I just noticed this part. Rereading my comment in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033#c22 I haven't been able to make it break with -std=c++11 . Is there something I'm missing here ? I don't remember. It might just be that trying to create a constexpr vector variable or calling __builtin_shuffle on it ICEs instead of giving an error. I can keep a note to make some tests at the end of July (I will be mostly away until then), but I believe the code from comment 22 is safer than the one from comment 20, if memory serves. -- Marc Glisse
Re: [PATCH, TileGX] Committed fix for a typo bug
Thanks. I found another one and I fixed it.

2012-06-15  Walter Lee  w...@tilera.com

	* config/tilegx/sync.md (atomic_fetch_<fetchop_name><mode>): Fix typo.

Index: config/tilegx/sync.md
===================================================================
--- config/tilegx/sync.md (revision 188672)
+++ config/tilegx/sync.md (working copy)
@@ -121,7 +121,7 @@
   emit_insn (gen_atomic_fetch_<fetchop_name>_bare<mode> (operands[0],
                                                         operands[1],
                                                         operands[2]));
-  tilegx_pre_atomic_barrier (model);
+  tilegx_post_atomic_barrier (model);
   DONE;
 })

On 6/14/2012 6:46 PM, Maxim Kuvyrkov wrote:
> Walter,
>
> While working on atomics for a different target, I've noticed below typo bug in TileGX. Patch checked in as obvious.
>
> Thank you,
>
> -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics
Re: divide 64-bit by constant for 32-bit target machines
Hi, Richard,

How about if I add umul_highpart_di to libgcc and use it instead of expanding the high-part multiplication directly, or add my own function that takes the pre-shift, the post-shift, the 64-bit constant and the 64-bit operand as parameters, and use it for division at optimization levels below -O3?

thanks, Dinar.

On Fri, Jun 15, 2012 at 12:12 PM, Richard Earnshaw rearn...@arm.com wrote:
> On 14/06/12 19:46, Dinar Temirbulatov wrote:
>> Hi,
>> OK for trunk?
>> thanks, Dinar.
>
> I'm still not comfortable about the code bloat that this is likely to incur at -O2.
>
> R.
>
>> On Tue, Jun 12, 2012 at 11:00 AM, Paolo Bonzini bonz...@gnu.org wrote:
>> On 12/06/2012 08:52, Dinar Temirbulatov wrote:
>> is safe? That is, that the underflows cannot produce a wrong result?
>> [snip]
>> Thanks very much!
>> Paolo
>>
>> ChangeLog.txt
>>
>> 2012-06-14  Dinar Temirbulatov  dtemirbula...@gmail.com
>> 	    Alexey Kravets  mr.kayr...@gmail.com
>> 	    Paolo Bonzini  bonz...@gnu.org
>>
>> 	* config/arm/arm.c (arm_rtx_costs_1): Add cost estimate for the integer double-word division operation.
>> 	* config/mips/mips.c (mips_rtx_costs): Extend cost estimate for the integer double-word division operation for 32-bit targets.
>> 	* gcc/expmed.c (expand_mult_highpart_optab): Allow to generate the higher multiplication product for unsigned double-word integers using 32-bit wide registers.
>>
>> Attachment: 30.patch
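For context on what is being proposed here, the following sketch shows the general technique under discussion: an unsigned 64-bit division by a constant turned into a high-part multiplication plus a shift, with the high part assembled from 32-bit partial products as a 32-bit target would have to do. This is not the code from the patch. The names umul_highpart_64 and divide_by_10 are hypothetical, and the constant 0xCCCCCCCCCCCCCCCD with a post-shift of 3 is the well-known pair for dividing by 10 (it equals ceil(2^67 / 10)).

```cpp
#include <cstdint>

// High 64 bits of the 128-bit product a * b, built from four 32x32->64
// partial products. Hypothetical helper, comparable in spirit to the
// umul_highpart_di routine suggested in the message above.
std::uint64_t umul_highpart_64(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t mask = 0xFFFFFFFFu;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

    const std::uint64_t lo_lo = a_lo * b_lo;
    const std::uint64_t hi_lo = a_hi * b_lo;
    const std::uint64_t lo_hi = a_lo * b_hi;
    const std::uint64_t hi_hi = a_hi * b_hi;

    // Sum of the terms that land in bits 32..95; this cannot overflow 64 bits.
    const std::uint64_t cross = (lo_lo >> 32) + (hi_lo & mask) + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

// n / 10 without a division instruction: multiply by ceil(2^67 / 10),
// keep the high 64 bits, then shift right by 3 (no pre-shift is needed).
std::uint64_t divide_by_10(std::uint64_t n)
{
    return umul_highpart_64(n, 0xCCCCCCCCCCCCCCCDull) >> 3;
}
```

Expanding this inline for every constant divisor is where the code-size concern at -O2 comes from; a shared library routine taking the constant and the shifts as parameters would trade some speed for smaller code.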
Re: [RFC C++] Turn on builtin_shuffle for C++.
On 15 June 2012 18:18, Marc Glisse marc.gli...@inria.fr wrote:
> On Fri, 15 Jun 2012, Ramana Radhakrishnan wrote:
>
>>> I just noticed this part. Rereading my comment in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033#c22
>>
>> I haven't been able to make it break with -std=c++11. Is there something I'm missing here?
>
> I don't remember. It might just be that trying to create a constexpr vector variable or calling __builtin_shuffle on it ICEs instead of giving an error. I can keep a note to make some tests at the end of July (I will be mostly away until then), but I believe the code from comment 22 is safer than the one from comment 20, if memory serves.

I'm not qualified enough to take a call on what's better in this case and will have to defer to Jason and the C++ maintainers on this one.

Now that you've said this I decided to go back and throw more tests through it. I've tried to chug through most of the testcases for __builtin_shuffle, including a few of my own, the simplest of which I show below, trying to trigger this issue but can't seem to do so.

  typedef int v4si __attribute__ ((vector_size (16)));
  v4si c;
  const v4si d = (v4si) { 10, 11, 23, 33};

  v4si vs (v4si a, v4si b)
  {
    c = __builtin_shuffle (a, b, (v4si){0, 4, 1, 5});
    return a;
  }

Of course it is not complicated C++ in any which way, but the frontend ends up generating something like the following

  (void) (c = VEC_PERM_EXPR <a, b, TARGET_EXPR <D.5209, {0, 4, 1, 5}>>) ;

rather than anything else, but I could be missing something fundamental here if that's not what you expect the C++ frontend to be doing.

regards, Ramana

> -- Marc Glisse
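As a reading aid for the test case above, here is a portable emulation of what a two-operand shuffle with the mask {0, 4, 1, 5} selects. It follows the element numbering documented for GCC's vector extensions (elements of the first vector are 0..N-1, those of the second are N..2N-1, and mask values are taken modulo 2N), but it uses std::array rather than the vector_size attribute, and the function name shuffle2 is made up for this sketch. It says nothing about how the C++ front end lowers VEC_PERM_EXPR.

```cpp
#include <array>
#include <cstddef>

// Emulates a two-operand shuffle over 4-element integer vectors:
// index 0..3 picks from a, index 4..7 picks from b, and every mask
// element is reduced modulo 8 first.
std::array<int, 4> shuffle2(const std::array<int, 4> &a,
                            const std::array<int, 4> &b,
                            const std::array<int, 4> &mask)
{
    std::array<int, 4> result{};
    for (std::size_t i = 0; i < 4; ++i) {
        const unsigned index = static_cast<unsigned>(mask[i]) % 8u;
        result[i] = index < 4 ? a[index] : b[index - 4];
    }
    return result;
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, the mask {0, 4, 1, 5} interleaves the low halves of the two inputs and yields {1, 5, 2, 6}, which is the kind of permutation the Neon zip-style intrinsics need.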
RE: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
CC: ebotcazou gcc-patches gingold rth joseph jay.krell From: mikestump To: palves On Jun 15, 2012, at 2:22 AM, Pedro Alves pal...@redhat.com wrote: It's not about example, but the fact that host compilers have been compiling that code as part of building gcc for years, without anyone complaining Yeah, I think we should just jump to c++ 11 and not look back... Fighting against using a 10 year old language standard I think is silly; and I like have the old obsolete ports in gcc. 64bit integer might not be called long long, it could be long or __int64, size_t/ptrdiff_t, etc.. I do find gcc's portability impressive, and one might suggest multiple precision arithmetic, a pair of longs, but indeed compilers lacking some 64bit integer by some name are rare, and one could always bootstrap via older gcc or take advantage of biarch/multiarch and first build native 32bit and then native 64bit with the native 32bit gcc as the bootstrap compiler. (I relatively recently bootstrapped hppa-hpux-gcc-4.x via KR cc via gcc 3.x (3.3?). Obviously it is more time and work, but it does work, and frees mainline gcc from caring.) Heck, one could even automate this like how there is a multi-pass bootstrap, adding earlier stages that go via e.g. gcc 3.3. The earlier compiler stages could be stripped down, e.g. no optimizer, no debug info output, no LTO. - Jay
[testsuite] g++.dg, g++.old-deja: unique lines for messages in test summary
This patch modifies dg-message, dg-warning, and dg-error test directives for several G++ tests by adding comments that will be added to lines in the test summary to eliminate non-unique lines for checks of messages for the same line of source code in a test.

Tested on i686-pc-linux-gnu and arm-none-eabi, checked in.

Janis

2012-06-15  Janis Johnson  jano...@codesourcery.com

	* g++.dg/cpp0x/auto27.C: Add comments to checks for multiple messages reported for one line of source code.
	* g++.dg/cpp0x/constexpr-decl.C: Likewise.
	* g++.dg/cpp0x/decltype2.C: Likewise.
	* g++.dg/cpp0x/decltype3.C: Likewise.
	* g++.dg/cpp0x/lambda/lambda-syntax1.C: Likewise.
	* g++.dg/cpp0x/regress/error-recovery1.C: Likewise.
	* g++.dg/cpp0x/static_assert3.C: Likewise.
	* g++.dg/cpp0x/udlit-cpp98-neg.C: Likewise.
	* g++.dg/cpp0x/udlit-shadow-neg.C: Likewise.
	* g++.dg/cpp0x/union1.C: Likewise.
	* g++.dg/cpp0x/variadic-ex10.C: Likewise.
	* g++.dg/cpp0x/variadic-ex14.C: Likewise.
	* g++.dg/cpp0x/variadic2.C: Likewise.
	* g++.dg/cpp0x/variadic20.C: Likewise.
	* g++.dg/cpp0x/variadic74.C: Likewise.
	* g++.dg/diagnostic/bitfld2.C: Likewise.
	* g++.dg/ext/attrib44.C: Likewise.
	* g++.dg/ext/no-asm-1.C: Likewise.
	* g++.dg/other/error34.C: Likewise.
	* g++.dg/parse/crash46.C: Likewise.
	* g++.dg/parse/error10.C: Likewise.
	* g++.dg/parse/error2.C: Likewise.
	* g++.dg/parse/error3.C: Likewise.
	* g++.dg/parse/error36.C: Likewise.
	* g++.dg/parse/error8.C: Likewise.
	* g++.dg/parse/error9.C: Likewise.
	* g++.dg/parse/parser-pr28152-2.C: Likewise.
	* g++.dg/parse/parser-pr28152.C: Likewise.
	* g++.dg/parse/template25.C: Likewise.
	* g++.dg/parse/typename11.C: Likewise.
	* g++.dg/tc1/dr147.C: Likewise.
	* g++.dg/template/deduce3.C: Likewise.
	* g++.dg/template/koenig9.C: Likewise.
	* g++.dg/template/pr23510.C: Likewise.
	* g++.dg/warn/pr12242.C: Likewise.
	* g++.dg/warn/pr30551-2.C: Likewise.
	* g++.dg/warn/pr30551.C: Likewise.
	* g++.old-deja/g++.other/typename1.C: Likewise.
	* g++.old-deja/g++.pt/niklas01a.C: Likewise.
Index: g++.dg/cpp0x/auto27.C === --- g++.dg/cpp0x/auto27.C (revision 188540) +++ g++.dg/cpp0x/auto27.C (working copy) @@ -1,6 +1,6 @@ // PR c++/51186 -auto main()-int // { dg-error std= { target c++98 } } - // { dg-error auto { target c++98 } 3 } - // { dg-error no type { target c++98 } 3 } +auto main()-int // { dg-error std= std { target c++98 } } + // { dg-error auto auto { target c++98 } 3 } + // { dg-error no type no type { target c++98 } 3 } { } Index: g++.dg/cpp0x/constexpr-decl.C === --- g++.dg/cpp0x/constexpr-decl.C (revision 188540) +++ g++.dg/cpp0x/constexpr-decl.C (working copy) @@ -2,8 +2,8 @@ // { dg-options -std=c++0x } struct S { - static constexpr int size; // { dg-error must have an initializer } - // { dg-error previous declaration { target *-*-* } 5 } + static constexpr int size; // { dg-error must have an initializer must have } + // { dg-error previous declaration previous { target *-*-* } 5 } }; const int limit = 2 * S::size; Index: g++.dg/cpp0x/decltype2.C === --- g++.dg/cpp0x/decltype2.C(revision 188540) +++ g++.dg/cpp0x/decltype2.C(working copy) @@ -45,8 +45,8 @@ int bar(int); CHECK_DECLTYPE(decltype(foo), int(char)); -decltype(bar) z; // { dg-error overload } -// { dg-error invalid type { target *-*-* } 48 } +decltype(bar) z; // { dg-error overload overload } +// { dg-error invalid type invalid { target *-*-* } 48 } CHECK_DECLTYPE(decltype(foo), int(*)(char)); CHECK_DECLTYPE(decltype(*foo), int()(char)); Index: g++.dg/cpp0x/decltype3.C === --- g++.dg/cpp0x/decltype3.C(revision 188540) +++ g++.dg/cpp0x/decltype3.C(working copy) @@ -55,8 +55,8 @@ }; CHECK_DECLTYPE(decltype(aa.*A::a), int); -decltype(aa.*A::b) zz; // { dg-error cannot create pointer to reference member } -// { dg-error invalid type { target *-*-* } 58 } +decltype(aa.*A::b) zz; // { dg-error cannot create pointer to reference member cannot } +// { dg-error invalid type invalid type { target *-*-* } 58 } CHECK_DECLTYPE(decltype(caa.*A::a), const int); class X { Index: 
g++.dg/cpp0x/lambda/lambda-syntax1.C === --- g++.dg/cpp0x/lambda/lambda-syntax1.C(revision 188540) +++ g++.dg/cpp0x/lambda/lambda-syntax1.C
[testsuite] g++.dg/torture/stackalign: make compile lines unique in test summary
Like the C stackalign tests, the tests in g++.dg/torture/stackalign use two sets of torture options: the usual optimization sets used as default for torture tests, and sets of options that are specific to stack alignment. There are fewer stack alignment options used for the G++ tests but otherwise the setup is the same. This patch is similar to the one for the C stackalign tests and uses existing support to combine the torture options and stackalign options so they are all reported in the pass/fail lines in the test summary to make each line unique. Since this isn't significantly different from the patch for C tests I'm not waiting for a review. Tested on i686-pc-linux-gnu and arm-none-eabi, checked in. Janis 2012-06-15 Janis Johnson jani...@codesourcery.com * g++.dg/torture/stackalign/stackalign.exp: Combine stack alignment torture options with usual torture options. Index: g++.dg/torture/stackalign/stackalign.exp === --- g++.dg/torture/stackalign/stackalign.exp(revision 188540) +++ g++.dg/torture/stackalign/stackalign.exp(working copy) @@ -1,4 +1,4 @@ -# Copyright (C) 2008, 2010 +# Copyright (C) 2008, 2010, 2012 # Free Software Foundation, Inc. # This program is free software; you can redistribute it and/or modify @@ -18,18 +18,41 @@ # This harness is for tests that should be run at all optimisation levels. load_lib g++-dg.exp +load_lib torture-options.exp + +global DG_TORTURE_OPTIONS LTO_TORTURE_OPTIONS + dg-init -set additional_flags +torture-init +# default_flags are replaced by a dg-options test directive, or appended +# to by using dg-additional-options. Use default_flags for options that +# are used in all of the torture sets to limit the amount of noise in +# test summaries. +set default_flags + +# torture_flags are combined with other torture options and do not +# affect options specified within a test. +set torture_flags + +set stackalign_options [list] + # If automatic stack alignment is supported, force it on. 
if { [check_effective_target_automatic_stack_alignment] } then { -lappend additional_flags -mstackrealign -lappend additional_flags -mpreferred-stack-boundary=5 +append default_flags -mstackrealign +append default_flags -mpreferred-stack-boundary=5 } +lappend stackalign_options [join $torture_flags] -gcc-dg-runtest [lsort [glob $srcdir/$subdir/*.C]] $additional_flags if { [check_effective_target_fpic] } then { -lappend additional_flags -fpic -gcc-dg-runtest [lsort [glob $srcdir/$subdir/*.C]] $additional_flags +lappend torture_flags -fpic +lappend stackalign_options [join $torture_flags] } + +# Combine stackalign options with the usual torture optimization flags. +set-torture-options [concat $DG_TORTURE_OPTIONS $LTO_TORTURE_OPTIONS] $stackalign_options + +gcc-dg-runtest [lsort [glob $srcdir/$subdir/*.C]] $default_flags + +torture-finish dg-finish
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
>>>>> "Eric" == Eric Botcazou ebotca...@adacore.com writes:

Pedro> It's not about example, but the fact that host compilers have been
Pedro> compiling that code as part of building gcc for years, without anyone
Pedro> complaining, afaik. It doesn't matter whether the code pointed at
Pedro> is the ugliest or most beautiful code on earth. What matters is whether
Pedro> it uses long long unconditionally on all hosts or not.
Pedro> IOW, what are the still supported hosts/compilers that don't
Pedro> support long long? If there are any, it appears none has been used
Pedro> in at least the past 5 years, IIU the code correctly.

Eric> OK, but GCC still officially requires only an ISO C90 compiler
Eric> http://gcc.gnu.org/install/prerequisites.html
Eric> so the usage of 'long long' in libdecnumber is a bug that could be
Eric> fixed at some point. That's why using it as a precedent isn't the
Eric> best thing to do.

It's true that this is a pedantic violation; but the point here is that there is no practical barrier to using 'long long'. This code has been in the tree since 2007; so if there is some issue with it, it ought to have surfaced by now.

Tom
[testsuite] gcov.exp: include flags in test summary lines
GCOV tests for C++ are run for both -std=gnu++98 and -std=gnu++11. Those options are not reported by GCOV-specific lines in the test summary, leading to non-unique lines. This patch modifies the GCOV test support to use a testname that includes the extra flags used for a set of tests and also modifies the format of summary lines to better incorporate that information.

For example, these lines:

  PASS: gcc.misc-tests/gcov-1.c:17 line count
  PASS: gcc.misc-tests/gcov-1.c gcov
  PASS: g++.dg/gcov/gcov-1.C:279 line count
  PASS: g++.dg/gcov/gcov-1.C gcov

are now:

  PASS: gcc.misc-tests/gcov-1.c count for line 17
  PASS: gcc.misc-tests/gcov-1.c gcov
  PASS: g++.dg/gcov/gcov-1.C -std=gnu++98 count for line 279
  PASS: g++.dg/gcov/gcov-1.C -std=gnu++98 gcov
  PASS: g++.dg/gcov/gcov-1.C -std=gnu++11 count for line 279
  PASS: g++.dg/gcov/gcov-1.C -std=gnu++11 gcov

Tested on i686-pc-linux-gnu and arm-eabi for gcc and g++ GCOV tests. OK for mainline?

Janis

2012-06-15  Janis Johnson  jani...@codesourcery.com

  * lib/gcov.exp (verify-lines, verify-branches, verify-calls): Use testname that includes flags, passed in as new argument, in pass/fail messages.
  (run_gcov): Get testname from dg-test, use it in pass/fail messages and pass it to verify-* procedures.

Index: lib/gcov.exp
===================================================================
--- lib/gcov.exp (revision 188622)
+++ lib/gcov.exp (working copy)
@@ -34,12 +34,14 @@
 #
 # verify-lines -- check that line counts are as expected
 #
-# TESTCASE is the name of the test.
+# TESTNAME is the name of the test, including unique flags.
+# TESTCASE is the name of the test file.
 # FILE is the name of the gcov output file.
# -proc verify-lines { testcase file } { +proc verify-lines { testname testcase file } { #send_user verify-lines\n global subdir + set failed 0 set fd [open $file r] while { [gets $fd line] = 0 } { @@ -54,13 +56,13 @@ } } if { $is == } { - fail $subdir/$testcase:$n:no data available for this line + fail $testname line $n: no data available incr failed } elseif { $is != $shouldbe } { - fail $subdir/$testcase:$n:is $is:should be $shouldbe + fail $testname line $n: is $is:should be $shouldbe incr failed } else { - pass $subdir/$testcase:$n line count + pass $testname count for line $n } } } @@ -71,7 +73,8 @@ # # verify-branches -- check that branch percentages are as expected # -# TESTCASE is the name of the test. +# TESTNAME is the name of the test, including unique flags. +# TESTCASE is the name of the test file. # FILE is the name of the gcov output file. # # Checks are based on comments in the source file. This means to look for @@ -86,8 +89,9 @@ # branch instructions. Don't check for branches that might be # optimized away or replaced with predicated instructions. # -proc verify-branches { testcase file } { +proc verify-branches { testname testcase file } { #send_user verify-branches\n + set failed 0 set shouldbe set fd [open $file r] @@ -99,7 +103,7 @@ if [regexp branch\\((\[0-9 \]+)\\) $line all new_shouldbe] { # All percentages in the current list should have been seen. if {[llength $shouldbe] != 0} { - fail $n: expected branch percentages not found: $shouldbe + fail $testname line $n: expected branch percentages not found: $shouldbe incr failed set shouldbe } @@ -117,14 +121,14 @@ } elseif [regexp branch +\[0-9\]+ taken (-\[0-9\]+)% $line \ all taken] { # Percentages should never be negative. - fail $n: negative percentage: $taken + fail $testname line $n: negative percentage: $taken incr failed } elseif [regexp branch +\[0-9\]+ taken (\[0-9\]+)% $line \ all taken] { #send_user $n: taken = $taken\n # Percentages should never be greater than 100. 
if {$taken > 100} { - fail $n: percentage greater than 100: $taken + fail $testname line $n: branch percentage greater than 100: $taken incr failed } if {$taken > 50} { @@ -139,7 +143,7 @@ } elseif [regexp branch\\(end\\) $line] { # All percentages in the list should have been seen by now. if {[llength $shouldbe] != 0} { - fail $n: expected branch percentages not found: $shouldbe + fail $testname line $n: expected branch percentages not found: $shouldbe incr failed }
Re: [PATCH][2/n] alias.c TLC
On Mon, Jun 11, 2012 at 7:32 AM, Richard Guenther rguent...@suse.de wrote:
> This makes ao_ref_from_mem less conservative if either MEM_OFFSET or MEM_SIZE is not set. From other alias.c code and set_mem_attributes_minus_bitpos one has to conclude that MEM_EXPR is always conservatively correct (it only can cover a larger memory area) and the MEM_OFFSET / MEM_SIZE pair can refine it.
>
> Thus, we can make ao_ref_from_mem less conservative when, for example, faced with variable array accesses which set_mem_attributes_minus_bitpos ends up representing with an unknown MEM_OFFSET.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2012-06-11  Richard Guenther  rguent...@suse.de
>
>   * emit-rtl.c (set_mem_attributes_minus_bitpos): Remove dead code.
>   * alias.c (ao_ref_from_mem): MEM_EXPR is conservative, MEM_OFFSET and MEM_SIZE only refines it. Reflect that and be less conservative if either of the latter is not known.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53688

-- 
H.J.
Re: [testsuite] g++.dg/torture/stackalign: make compile lines unique in test summary
On Fri, Jun 15, 2012 at 11:06 AM, Janis Johnson janis_john...@mentor.com wrote:
> Like the C stackalign tests, the tests in g++.dg/torture/stackalign use two sets of torture options: the usual optimization sets used as default for torture tests, and sets of options that are specific to stack alignment. There are fewer stack alignment options used for the G++ tests but otherwise the setup is the same.
>
> This patch is similar to the one for the C stackalign tests and uses existing support to combine the torture options and stackalign options so they are all reported in the pass/fail lines in the test summary to make each line unique.
>
> Since this isn't significantly different from the patch for C tests I'm not waiting for a review. Tested on i686-pc-linux-gnu and arm-none-eabi, checked in.

Thanks.

-- 
H.J.
[google/main] Revert preliminary Fission patches (issue6310047)
This patch is for google/main. Revert Fission patches r182490, r182891, r183042, and r183320. These will be going into trunk, and will eventually merge back into google/main from there. r182490: gcc/c-family/ 2011-12-19 Sterling Augustine saugust...@google.com * c-pretty-print.c (pp_c_specifier_qualifier_list): Move conditional from beginning to end. gcc/cp/ 2011-12-19 Sterling Augustine saugust...@google.com * error.c (dump_decl): Reformat return value to (anonymous namespace). (lang_decl_name): Return (anonymous namespace) when appropriate. gcc/ 2011-12-19 Sterling Augustine saugust...@google.com * dwarf2out.c (DEBUG_PUBNAMES_SECTION_LABEL, DEBUG_PUBTYPES_SECTION_LABEL): Define. (debug_pubnames_section_label, debug_pubtypes_section_label): Declare. (is_namespace_die, is_class_die): New functions. (add_enumerator_pubname): New function. (add_pubname): Call is_namespace_die, is_cu_die, and is_class_die in conditional. (add_pubtype): Call is_namespace_die. Rework name calculation. Call type_tag, lang_hooks.dwarf_name and add_enumerator_pubname. (output_pubnames): Output debug_pubnames_section_label or debug_pubtypes_section_label. (base_type_die): Call add_pubtype. (gen_namespace_die): Call add_pubname_string and lang_hooks.dwarf_name. (dwarf2_out_init): Generate debug_pubnames_section_label and debug_pubtypes_section_label. (pubtypes_section_empty): New function. (dwarf2_out_finish): Call add_AT_lineptr if pubnames or pubtypes is non-empty. When dealing with pubnames, change assertion to conditional. Call pubtypes_section_empty. Likewise when dealing with pubtypes. Move code checking for empty section to... (pubtypes_section_empty): Here. * target.def: Switch boolean to enable pubnames and pubtypes. r182891: gcc/ 2012-01-04 Sterling Augustine saugust...@google.com * dwarf2out.c (add_pubname): Move conditional clause from outer to inner if-statement. (dwarf2out_finish): Fix conditions to output DW_AT_GNU_pubnames and DW_AT_GNU_pubtypes. 
Move decision to output pubnames and pubtypes from here... (output_pubnames): ...to here. (pubtypes_section_empty): Delete unused function. gcc/testsuite/ 2012-01-04 Sterling Augustine saugust...@google.com * g++.dg/diagnostic/bindings1.C: Adjust expected output. * g++.dg/ext/pretty3.C: Likewise. * g++.dg/pr44486.C: Likewise. * g++.dg/warn/Wuninitializable-member.C: Likewise. * g++.dg/warn/pr35711.C: Likewise. * g++.old-deja/g++.pt/memtemp77.C: Likewise. r183042: gcc/ 2012-01-09 Sterling Augustine saugust...@google.com * dwarf2out.c (output_pubnames): Add check for info_section_emitted. r183320: gcc/ 2012-01-19 Sterling Augustine saugust...@google.com * dwarf2out.c (break_out_comdat_types): Add DW_AT_GNU_pubnames and DW_AT_GNU_pubtypes attributes. Index: gcc/c-family/c-pretty-print.c === --- gcc/c-family/c-pretty-print.c (revision 188675) +++ gcc/c-family/c-pretty-print.c (working copy) @@ -446,6 +446,8 @@ pp_c_specifier_qualifier_list (c_pretty_ { const enum tree_code code = TREE_CODE (t); + if (TREE_CODE (t) != POINTER_TYPE) +pp_c_type_qualifier_list (pp, t); switch (code) { case REFERENCE_TYPE: @@ -492,8 +494,6 @@ pp_c_specifier_qualifier_list (c_pretty_ pp_simple_type_specifier (pp, t); break; } - if (TREE_CODE (t) != POINTER_TYPE) -pp_c_type_qualifier_list (pp, t); } /* parameter-type-list: Index: gcc/target.def === --- gcc/target.def (revision 188675) +++ gcc/target.def (working copy) @@ -2813,7 +2813,7 @@ DEFHOOKPOD True if the @code{.debug_pubtypes} and @code{.debug_pubnames} sections\ should be emitted. These sections are not used on most platforms, and\ in particular GDB does not use them., - bool, true) + bool, false) DEFHOOKPOD (delay_sched2, True if sched2 is not to be run at its normal place. 
\ Index: gcc/testsuite/g++.old-deja/g++.pt/memtemp77.C === --- gcc/testsuite/g++.old-deja/g++.pt/memtemp77.C (revision 188675) +++ gcc/testsuite/g++.old-deja/g++.pt/memtemp77.C (working copy) @@ -19,7 +19,7 @@ const char* S3char::h(int) { return __ int main() { if (strcmp (S3double::h(7), - static char const* S3T::h(U) [with U = int; T = double]) == 0) + static const char* S3T::h(U) [with U = int; T = double]) == 0) return 0; else return 1; Index: gcc/testsuite/g++.dg/ext/pretty3.C
[google/gcc-4_7] Revert preliminary Fission patches (issue6303084)
This patch is for google/gcc-4_7. Revert Fission patches r182490, r182891, r183042, and r183320. This will clear the way to backport the final patches from trunk. r182490: gcc/c-family/ 2011-12-19 Sterling Augustine saugust...@google.com * c-pretty-print.c (pp_c_specifier_qualifier_list): Move conditional from beginning to end. gcc/cp/ 2011-12-19 Sterling Augustine saugust...@google.com * error.c (dump_decl): Reformat return value to (anonymous namespace). (lang_decl_name): Return (anonymous namespace) when appropriate. gcc/ 2011-12-19 Sterling Augustine saugust...@google.com * dwarf2out.c (DEBUG_PUBNAMES_SECTION_LABEL, DEBUG_PUBTYPES_SECTION_LABEL): Define. (debug_pubnames_section_label, debug_pubtypes_section_label): Declare. (is_namespace_die, is_class_die): New functions. (add_enumerator_pubname): New function. (add_pubname): Call is_namespace_die, is_cu_die, and is_class_die in conditional. (add_pubtype): Call is_namespace_die. Rework name calculation. Call type_tag, lang_hooks.dwarf_name and add_enumerator_pubname. (output_pubnames): Output debug_pubnames_section_label or debug_pubtypes_section_label. (base_type_die): Call add_pubtype. (gen_namespace_die): Call add_pubname_string and lang_hooks.dwarf_name. (dwarf2_out_init): Generate debug_pubnames_section_label and debug_pubtypes_section_label. (pubtypes_section_empty): New function. (dwarf2_out_finish): Call add_AT_lineptr if pubnames or pubtypes is non-empty. When dealing with pubnames, change assertion to conditional. Call pubtypes_section_empty. Likewise when dealing with pubtypes. Move code checking for empty section to... (pubtypes_section_empty): Here. * target.def: Switch boolean to enable pubnames and pubtypes. r182891: gcc/ 2012-01-04 Sterling Augustine saugust...@google.com * dwarf2out.c (add_pubname): Move conditional clause from outer to inner if-statement. (dwarf2out_finish): Fix conditions to output DW_AT_GNU_pubnames and DW_AT_GNU_pubtypes. 
Move decision to output pubnames and pubtypes from here... (output_pubnames): ...to here. (pubtypes_section_empty): Delete unused function. gcc/testsuite/ 2012-01-04 Sterling Augustine saugust...@google.com * g++.dg/diagnostic/bindings1.C: Adjust expected output. * g++.dg/ext/pretty3.C: Likewise. * g++.dg/pr44486.C: Likewise. * g++.dg/warn/Wuninitializable-member.C: Likewise. * g++.dg/warn/pr35711.C: Likewise. * g++.old-deja/g++.pt/memtemp77.C: Likewise. r183042: gcc/ 2012-01-09 Sterling Augustine saugust...@google.com * dwarf2out.c (output_pubnames): Add check for info_section_emitted. r183320: gcc/ 2012-01-19 Sterling Augustine saugust...@google.com * dwarf2out.c (break_out_comdat_types): Add DW_AT_GNU_pubnames and DW_AT_GNU_pubtypes attributes. Index: gcc/c-family/c-pretty-print.c === --- gcc/c-family/c-pretty-print.c (revision 188679) +++ gcc/c-family/c-pretty-print.c (working copy) @@ -446,6 +446,8 @@ pp_c_specifier_qualifier_list (c_pretty_ { const enum tree_code code = TREE_CODE (t); + if (TREE_CODE (t) != POINTER_TYPE) +pp_c_type_qualifier_list (pp, t); switch (code) { case REFERENCE_TYPE: @@ -492,8 +494,6 @@ pp_c_specifier_qualifier_list (c_pretty_ pp_simple_type_specifier (pp, t); break; } - if (TREE_CODE (t) != POINTER_TYPE) -pp_c_type_qualifier_list (pp, t); } /* parameter-type-list: Index: gcc/target.def === --- gcc/target.def (revision 188679) +++ gcc/target.def (working copy) @@ -2813,7 +2813,7 @@ DEFHOOKPOD True if the @code{.debug_pubtypes} and @code{.debug_pubnames} sections\ should be emitted. These sections are not used on most platforms, and\ in particular GDB does not use them., - bool, true) + bool, false) DEFHOOKPOD (delay_sched2, True if sched2 is not to be run at its normal place. 
\ Index: gcc/testsuite/g++.old-deja/g++.pt/memtemp77.C === --- gcc/testsuite/g++.old-deja/g++.pt/memtemp77.C (revision 188679) +++ gcc/testsuite/g++.old-deja/g++.pt/memtemp77.C (working copy) @@ -19,7 +19,7 @@ const char* S3char::h(int) { return __ int main() { if (strcmp (S3double::h(7), - static char const* S3T::h(U) [with U = int; T = double]) == 0) + static const char* S3T::h(U) [with U = int; T = double]) == 0) return 0; else return 1; Index: gcc/testsuite/g++.dg/ext/pretty3.C
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
> It's true that this is a pedantic violation; but the point here is that there is no practical barrier to using 'long long'. This code has been in the tree since 2007; so if there is some issue with it, it ought to have surfaced by now.

The whole compiler is written using HOST_WIDE_INT and the like, so using some external code that managed to escape a proper review before being merged in order to justify an incorrect usage is IMO short-sighted, to say the least.

-- 
Eric Botcazou
Re: [RFC C++] Turn on builtin_shuffle for C++.
On Fri, 15 Jun 2012, Ramana Radhakrishnan wrote:

> On 15 June 2012 18:18, Marc Glisse marc.gli...@inria.fr wrote: On Fri, 15 Jun 2012, Ramana Radhakrishnan wrote: I just noticed this part. Rereading my comment in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033#c22 I haven't been able to make it break with -std=c++11 . Is there something I'm missing here ? I don't remember. It might just be that trying to create a constexpr vector variable or calling __builtin_shuffle on it ICEs instead of giving an error. I can keep a note to make some tests at the end of July (I will be mostly away until then), but I believe the code from comment 22 is safer than the one from comment 20, if memory serves. I'm not qualified enough to take a call on what's better in this case and will have to defer to Jason and the C++ maintainers on this one. Now that you've said this I decided to go back and throw more tests through it I've tried to chug through most of the testcases for __builtin_shuffle including a few of my own the simplest of which I show below trying to trigger this issue but can't seem to do so.

Maybe something like:

#include <x86intrin.h>
int main(){
  constexpr __m128d x={1.,2.};
  constexpr __m128i y={1,0};
  constexpr __m128d z=__builtin_shuffle(x,y);
}

? (sorry for the x86 specific code, should be easy to adapt)

See also: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53094

Long term, vectors should be literals. But we need something to avoid crashes on operator[] and __builtin_shuffle (ideally implementing the constant version of them). Keeping vectors as non-literals (what I was suggesting) is quite a crude hack. Maybe having them as literals now is a good thing, but it would be good to avoid the ICEs.

-- 
Marc Glisse
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern "C++"'
On Fri, Jun 15, 2012 at 11:06 AM, Ian Lance Taylor i...@google.com wrote:
> Richard Guenther richard.guent...@gmail.com writes:
>
>> Ian - you added this include in rev. 167764, I don't think that was proper. But I'm not sure wrapping a system.h include inside extern "C" from a C++ plugin is proper either ...
>
> I did commit 167764 but I didn't write it. It's from
> http://gcc.gnu.org/ml/gcc-patches/2010-11/msg02567.html
> http://gcc.gnu.org/PR46650
>
> The patch is there because system.h poisons strerror. Clearly we have to #include <string.h> before poisoning strerror. And we do. But when we #include C++ headers, some of the C++ headers #include <cstring>. So system.h needs to do that also.
>
> I think there is no question that as long as system.h poisons strerror, we need to arrange to #include both <string.h> and <cstring> before that poisoning, and that the natural way to ensure that is to #include both in system.h. And that is what we do today.
>
> I don't really know what the right solution is here, because I don't know how we feel about wrapping #include "system.h" in extern "C". A simple workaround is to #include <cstring> before the #include "system.h". Or the OP's patch using extern "C++" is a simple workaround within system.h. Or maybe we simply drop the poison of strerror, and then system.h doesn't need to #include <cstring> anyhow.

As long as we don't control <string.h> and <cstring>, we can't put them in a language linkage specification. What we can do is what I suggested in my last message: just give the language specification to the declarations that matter in gcc/system.h.

-- Gaby
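[Editorial sketch, not part of the thread] The nested-linkage workaround discussed in this message can be illustrated in isolation. This is only a minimal sketch under stated assumptions: the names sketch_c_length and sketch_twice are hypothetical, and the outer block stands in for what a C++ plugin might wrap around a C-style header; it is not the actual system.h change.

```cpp
#include <cstring>  // included up front, outside any linkage block

// Stand-in for the extern "C" block a C++ plugin might wrap around a
// C-style header (hypothetical; not the real system.h).
extern "C" {

// Declarations directly inside the block get C language linkage.
int sketch_c_length(const char *s) {
  return static_cast<int>(std::strlen(s));
}

// A nested linkage specification overrides the enclosing one, so
// C++-only constructs such as templates are legal again in here.
extern "C++" {
template <typename T>
T sketch_twice(T x) {
  return x + x;
}
}  // extern "C++"

}  // extern "C"
```

Without the inner extern "C++" block the template would be rejected, since templates cannot have C linkage; that is the reason a header containing C++ declarations needs protection when it may be included from inside extern "C".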
Second ping: Reorganized documentation for warnings
Second ping for patch that reorganized warning documentation http://gcc.gnu.org/ml/gcc-patches/2012-05/msg02024.html http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00423.html
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
>>>>> "Eric" == Eric Botcazou ebotca...@adacore.com writes:

> It's true that this is a pedantic violation; but the point here is that there is no practical barrier to using 'long long'. This code has been in the tree since 2007; so if there is some issue with it, it ought to have surfaced by now.

Eric> The whole compiler is written using HOST_WIDE_INT and the like, so
Eric> using some external code that managed to escape a proper review
Eric> before being merged in order to justify an incorrect usage is IMO
Eric> short-sighted, to say the least.

Not interested in trading barbs about it. Still, I'll find it in me to be partly tongue in cheek.

I don't understand what the code being external, or the review, has to do with anything. This code is compiled with the same host compiler as everything else.

HOST_WIDE_INT is also not very persuasive to me. We did many things in the past that became obsolete as compilers matured. You can still occasionally find workarounds for old compiler bugs in GNU source; but that doesn't make them relevant.

Maybe strict adherence to C90 gives some benefit, but I don't really know what that would be.

Of course, I'd rather we -- not GCC obviously, it is going another route, but the rest of the toolchain -- burn some bridges and move to C99. I think we deserve a 13 year old standard.

Tom
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern "C++"'
Gabriel Dos Reis g...@integrable-solutions.net writes:

> What we can do is what I suggested in my last message: just give the language specification to the declarations that matter in gcc/system.h.

Sure, just have to check #ifdef ENABLE_BUILD_WITH_CXX to know what specification to give.

Ian
Re: [testsuite] gcov.exp: include flags in test summary lines
On Jun 15, 2012, at 11:07 AM, Janis Johnson wrote:
> GCOV tests for C++ are run for both std=gnu++98 and std=gnu++11. Those options are not reported by GCOV-specific lines in the test summary,

> OK for mainline?

Ok. It is scary that upvar is ever used.
[PATCH][Cilkplus]PR 53567
Hello Everyone,

This patch is for the Cilkplus branch affecting both C and C++ compilers. The dwarf output function was looking for debugging information for an internally generated spawn helper which is not there. So this patch will make sure that those functions are excluded.

Thanks,

Balaji V. Iyer.

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c (revision 188679)
+++ gcc/dwarf2out.c (working copy)
@@ -19548,6 +19548,12 @@
 static void
 dwarf2out_function_decl (tree decl)
 {
+  if (flag_enable_cilk && decl && TREE_CODE (decl) == FUNCTION_DECL)
+    {
+      function *f = DECL_STRUCT_FUNCTION (decl);
+      if (f && f->is_cilk_helper_function)
+        return; /* can't do debugging output for spawn helper */
+    }
   dwarf2out_decl (decl);
   call_arg_locations = NULL;
   call_arg_loc_last = NULL;
Index: gcc/ChangeLog.cilk
===================================================================
--- gcc/ChangeLog.cilk (revision 188679)
+++ gcc/ChangeLog.cilk (working copy)
@@ -1,3 +1,7 @@
+2012-06-15  Balaji V. Iyer  balaji.v.i...@intel.com
+
+  * dwarf2out.c (dwarf2out_function_decl): Added a check for spawn helper.
+
 2012-06-15  Balaji V. Iyer  balaji.v.i...@intel.com
 
   * cilk.c (install_builtin): Added a check if pushdecl is successful.
Re: [testsuite] gcov.exp: include flags in test summary lines
On 06/15/2012 12:32 PM, Mike Stump wrote:
> On Jun 15, 2012, at 11:07 AM, Janis Johnson wrote:
>> GCOV tests for C++ are run for both std=gnu++98 and std=gnu++11. Those options are not reported by GCOV-specific lines in the test summary,
>
>> OK for mainline?
>
> Ok. It is scary that upvar is ever used

Yes, it is. I've sometimes thought that for testname it might be just a little less scary to use a proc like current_compiler_flags in target-supports-dg.exp, instead of using upvar in every single place we need access to that particular variable from dg-test.

Janis
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
On Fri, 15 Jun 2012, Tom Tromey wrote:

> HOST_WIDE_INT is also not very persuasive to me. We did many things in

Although HOST_WIDE_INT is used for too many different things (see Diego's and my architectural goals documents for more discussion, specifically HOST_WIDE_INT, HOST_WIDEST_INT and associated concepts at the bottom of the conventions document), I don't think we should use long long directly in the compiler (except in limited places such as hwint.h selecting a type to use for some abstraction) simply because it's not the right abstraction for saying what the requirements are on the type being used.

If the requirement is at least 64 bits, int_fast64_t would be better, for example (gnulib can generate a stdint.h where the host doesn't have it). If it's big enough for the target address space then HOST_WIDE_INT is what we have at present. If it's fast on the host, but size doesn't matter, then HOST_WIDEST_FAST_INT.

-- 
Joseph S. Myers
jos...@codesourcery.com
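[Editorial sketch, not part of the thread] The point about naming the requirement rather than the type can be shown with a small example. This is a sketch only, assuming a C++11 host compiler with <cstdint>; sketch_wide_int and sketch_fits_in_32_bits are hypothetical names and do not correspond to anything in hwint.h.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical alias: the requirement "at least 64 bits, fast on the
// host" is stated once, instead of writing long long at each use.
typedef std::int_fast64_t sketch_wide_int;

// Checked at compile time, so a host that cannot satisfy the
// requirement fails early rather than miscomputing constants.
static_assert(sizeof(sketch_wide_int) * CHAR_BIT >= 64,
              "sketch_wide_int must hold at least 64 bits");

// True when the value can be represented in a signed 32-bit integer,
// the kind of question a backend asks about a large constant.
inline bool sketch_fits_in_32_bits(sketch_wide_int value) {
  return value >= INT32_MIN && value <= INT32_MAX;
}
```

Only the typedef would need to change if the underlying type had to be chosen differently on some host; code that uses the alias stays the same.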
Re: [RFC C++] Turn on builtin_shuffle for C++.
On 15 June 2012 20:04, Marc Glisse marc.gli...@inria.fr wrote: On Fri, 15 Jun 2012, Ramana Radhakrishnan wrote: On 15 June 2012 18:18, Marc Glisse marc.gli...@inria.fr wrote: On Fri, 15 Jun 2012, Ramana Radhakrishnan wrote: I just noticed this part. Rereading my comment in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033#c22 I haven't been able to make it break with -std=c++11 . Is there something I'm missing here ? I don't remember. It might just be that trying to create a constexpr vector variable or calling __builtin_shuffle on it ICEs instead of giving an error. I can keep a note to make some tests at the end of July (I will be mostly away until then), but I believe the code from comment 22 is safer than the one from comment 20, if memory serves. I'm not qualified enough to take a call on what's better in this case and will have to defer to Jason and the C++ maintainers on this one. Now that you've said this I decided to go back and throw more tests through it I've tried to chug through most of the testcases for __builtin_shuffle including a few of my own the simplest of which I show below trying to trigger this issue but can't seem to do so. Maybe something like: #include x86intrin.h int main(){ constexpr __m128d x={1.,2.}; constexpr __m128i y={1,0}; constexpr __m128d z=__builtin_shuffle(x,y); } ? (sorry for the x86 specific code, should be easy to adapt) Thanks for the example and your patience - just shows my ignorance with C++11 :( . I'll try to have a look at this later today and see if I can come up with something and possibly integrating that bit of your patch. See also: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53094 Long term, vectors should be literals. But we need something to avoid crashes on operator[] and __builtin_shuffle (ideally implementing the constant version of them). Keeping vectors as non-literals (what I was suggesting) is quite a crude hack. Maybe having them as literals now is a good thing, but it would be good to avoid the ICEs. 
Agreed that the compiler shouldn't crash in these cases, now that I finally understand what you meant. Have a good break. Thanks, Ramana -- Marc Glisse
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern "C++"'
On Fri, Jun 15, 2012 at 2:17 PM, Ian Lance Taylor i...@google.com wrote: Gabriel Dos Reis g...@integrable-solutions.net writes: What we can do is what I suggested in my last message: just give the language specification to the declarations that matter in gcc/system.h. Sure, just have to check #ifdef ENABLE_BUILD_WITH_CXX to know what specification to give. Hmm... could you elaborate on checking for ENABLE_BUILD_WITH_CXX as opposed to just checking __cplusplus? -- Gaby
Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern "C++"'
Gabriel Dos Reis g...@integrable-solutions.net writes: On Fri, Jun 15, 2012 at 2:17 PM, Ian Lance Taylor i...@google.com wrote: Gabriel Dos Reis g...@integrable-solutions.net writes: What we can do is what I suggested in my last message: just give the language specification to the declarations that matter in gcc/system.h. Sure, just have to check #ifdef ENABLE_BUILD_WITH_CXX to know what specification to give. Hmm... could you elaborate on checking for ENABLE_BUILD_WITH_CXX as opposed to just checking __cplusplus? If ENABLE_BUILD_WITH_CXX is defined, then GCC itself is built with C++, and we want a C++ signature for functions. If it is not defined, then GCC itself is not built with C++, and we want (and must have) a C signature. I suppose we could decide that fancy_abort always uses a C signature, but that seems odd. Ian
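A minimal sketch of the distinction discussed above, assuming a header that may be read either by a C compiler or by a C++ compiler. It is not the real gcc/system.h: every SKETCH_* name and the function sketch_linkage_name are invented for illustration, and ENABLE_BUILD_WITH_CXX is deliberately left undefined here to model a program built as C. The point it tries to show is that __cplusplus only says which compiler is reading the header, while the configure-time macro says which signature the program's own functions were built with.

```c
/* Hypothetical sketch of a linkage-selecting header; not the real
   gcc/system.h.  All SKETCH_* names are invented for illustration.  */

/* Assumption: a configure step would define this when the program
   itself is built as C++ (only meaningful under a C++ compiler).
   Left undefined here, i.e. "the program is built as C".  */
/* #define ENABLE_BUILD_WITH_CXX 1 */

#if defined (ENABLE_BUILD_WITH_CXX)
/* Program built as C++: declarations get C++ signatures.  */
# define SKETCH_LINKAGE_BEGIN extern "C++" {
# define SKETCH_LINKAGE_END }
# define SKETCH_LINKAGE_NAME "C++"
#elif defined (__cplusplus)
/* Program built as C, but a C++ compiler is reading this header:
   the declarations must still name the C symbols.  */
# define SKETCH_LINKAGE_BEGIN extern "C" {
# define SKETCH_LINKAGE_END }
# define SKETCH_LINKAGE_NAME "C"
#else
/* Plain C: no linkage specification exists or is needed.  */
# define SKETCH_LINKAGE_BEGIN
# define SKETCH_LINKAGE_END
# define SKETCH_LINKAGE_NAME "C"
#endif

SKETCH_LINKAGE_BEGIN
/* Reports which signature kind the declarations above received.  */
const char *sketch_linkage_name (void);
SKETCH_LINKAGE_END

const char *
sketch_linkage_name (void)
{
  return SKETCH_LINKAGE_NAME;
}
```

With the macro left undefined, sketch_linkage_name () reports "C" whether the file is compiled as C or as C++, which is the case a bare __cplusplus check would get wrong.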
FW: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
[this time as plain text, sorry]

Date: Fri, 15 Jun 2012 19:58:23 +
From: joseph
To: tromey
CC: ebotcazou palves gcc-patches gingold rth mikestump
Subject: Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)

On Fri, 15 Jun 2012, Tom Tromey wrote: HOST_WIDE_INT is also not very persuasive to me. We did many things in

Although HOST_WIDE_INT is used for too many different things (see Diego's and my architectural goals documents for more discussion, specifically HOST_WIDE_INT, HOST_WIDEST_INT and associated concepts at the bottom of the conventions document), I don't think we should use long long directly in the compiler (except in limited places such as hwint.h selecting a type to use for some abstraction) simply because it's not the right abstraction for saying what the requirements are on the type being used. If the requirement is at least 64 bits, int_fast64_t would be better, for example (gnulib can generate a stdint.h where the host doesn't have it). If it's big enough for the target address space then HOST_WIDE_INT is what we have at present. If it's fast on the host, but size doesn't matter, then HOST_WIDEST_FAST_INT.

-- Joseph S. Myers joseph@

If it's fast on the host, but size doesn't matter, then HOST_WIDEST_FAST_INT

That is int, right? I guess sometimes long, since a 64bit integer might be faster on a 64bit host that has 32bit int?? For local variables, the size difference rarely amounts to much, I think. For data structures that you have many of, size optimizations become interesting.

One can easily dream up many abstractions, too many:

can at least hold host pointer
can at least hold target pointer
can hold the size of a target struct, 32 bits is ok with slightly degraded functionality if 64bits aren't available
can be a loop index for a certain smallish constant number of iterations -- e.g. what to use for (pass = 0; pass < 2;) or for (i = 0; i < sizeof(integer type);)
can hold source file size or offset, or seek delta (possibly negative?)
can hold host object file size or offset, or seek delta (possibly negative?)
can hold target object file size or offset, or seek delta (possibly negative?)
can hold host executable file size or offset, or seek delta (possibly negative?)
can hold target executable file size or offset, or seek delta (possibly negative?)
can hold host library/archive file size or offset, or seek delta (possibly negative?)
can hold target library/archive file size or offset, or seek delta (possibly negative?)
can hold the number of members in a library/archive, or seek delta (possibly negative?)
can hold the number of files in a directory (e.g. for #include search caching)
can hold the number of files seen in preprocessor run
number of instructions in a function (held in memory or in a file?)
number of basic blocks in a function (held in memory or in a file?)
number of something that is held in a file (same as file size generally)
number of something that is held in memory (size_t)
number of cycles measured or estimated
number of bytes allocated
number of bytes allocated minus number of bytes freed
length of an in-memory string (size_t strlen(), but rarely does 32bits not suffice)

One can even imagine a 53bit-mantissa double being used...but after some thought in my own code, I'd really rather depend on there being a 64bit integer.

It is tempting to throw up one's hands in disgust and just smush all the abstractions down to almost nothing. Otherwise you have to worry about whether the types interoperate well, which one is larger/smaller than the other, and how do I safely convert? Are there symbols for the min/max of each type?

int is always at least 32bits on modern hosts and a reasonable if not theoretical max for many things.
ditto long, but it is definitely at least 32bits, and often larger.
similarly HOST_WIDE_INT is pretty fast, maybe slower, often 64bits, and 64bits is usually vastly sufficient for the vast majority of things..unless manipulating floating point pieces.

One can check for overflow so that if a 32bit integer proves too small, there is a clear error instead of silent wraparound and crash or bug. One could encode such overflow checks into a C++ integer-like class with operator overloading. It's not that difficult..

Or just provide the stdint.h fast/atleast/exact types and let every section of code make its own typedefs thereof. Establish a naming convention perhaps such that when I see foo_t, I know at a glance that it is just some integer type. Maybe by always putting size or count in the name?

But some things are pervasive -- host/target address sizes/offsets. I need to go read the document..

- Jay
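To make the overflow-check and naming-convention ideas above concrete, here is a minimal sketch in C, assuming a counter built on a stdint.h "at least 32 bits" type. The names insn_count_t and insn_count_add are hypothetical and do not come from GCC; the sketch only shows that a failed addition can be reported to the caller instead of wrapping silently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical convention: "count" in the name marks a plain counting
   type, so a reader knows at a glance it is just an integer.  */
typedef uint_least32_t insn_count_t;

/* Add INCREMENT to *COUNT.  Returns true on success.  Returns false,
   leaving *COUNT unchanged, if the sum would not fit in the type.  */
bool
insn_count_add (insn_count_t *count, insn_count_t increment)
{
  if (increment > UINT_LEAST32_MAX - *count)
    return false;		/* Would wrap around: report it.  */
  *count += increment;
  return true;
}
```

A caller would test the result and issue a clear diagnostic on false, rather than continue with a wrapped value.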
[PATCH 2/3] Use synth_mult for vector multiplies vs scalar constant
---
 gcc/expmed.c   |  438 +++-
 gcc/machmode.h |    8 +-
 2 files changed, 249 insertions(+), 197 deletions(-)

diff --git a/gcc/expmed.c b/gcc/expmed.c
index b456bac..16c5c24 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2,7 +2,7 @@
    and shifts, multiplies and divides to rtl instructions.
    Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
    1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
-   2011
+   2011, 2012
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -93,43 +93,112 @@ static rtx expand_sdiv_pow2 (enum machine_mode, rtx, HOST_WIDE_INT);
 #define gen_extzv(a,b,c,d) NULL_RTX
 #endif
 
-void
-init_expmed (void)
+struct init_expmed_rtl
 {
-  struct
-  {
-    struct rtx_def reg;         rtunion reg_fld[2];
-    struct rtx_def plus;        rtunion plus_fld1;
-    struct rtx_def neg;
-    struct rtx_def mult;        rtunion mult_fld1;
-    struct rtx_def sdiv;        rtunion sdiv_fld1;
-    struct rtx_def udiv;        rtunion udiv_fld1;
-    struct rtx_def zext;
-    struct rtx_def sdiv_32;     rtunion sdiv_32_fld1;
-    struct rtx_def smod_32;     rtunion smod_32_fld1;
-    struct rtx_def wide_mult;   rtunion wide_mult_fld1;
-    struct rtx_def wide_lshr;   rtunion wide_lshr_fld1;
-    struct rtx_def wide_trunc;
-    struct rtx_def shift;       rtunion shift_fld1;
-    struct rtx_def shift_mult;  rtunion shift_mult_fld1;
-    struct rtx_def shift_add;   rtunion shift_add_fld1;
-    struct rtx_def shift_sub0;  rtunion shift_sub0_fld1;
-    struct rtx_def shift_sub1;  rtunion shift_sub1_fld1;
-  } all;
+  struct rtx_def reg;           rtunion reg_fld[2];
+  struct rtx_def plus;          rtunion plus_fld1;
+  struct rtx_def neg;
+  struct rtx_def mult;          rtunion mult_fld1;
+  struct rtx_def sdiv;          rtunion sdiv_fld1;
+  struct rtx_def udiv;          rtunion udiv_fld1;
+  struct rtx_def zext;
+  struct rtx_def sdiv_32;       rtunion sdiv_32_fld1;
+  struct rtx_def smod_32;       rtunion smod_32_fld1;
+  struct rtx_def wide_mult;     rtunion wide_mult_fld1;
+  struct rtx_def wide_lshr;     rtunion wide_lshr_fld1;
+  struct rtx_def wide_trunc;
+  struct rtx_def shift;         rtunion shift_fld1;
+  struct rtx_def shift_mult;    rtunion shift_mult_fld1;
+  struct rtx_def shift_add;     rtunion shift_add_fld1;
+  struct rtx_def shift_sub0;    rtunion shift_sub0_fld1;
+  struct rtx_def shift_sub1;    rtunion shift_sub1_fld1;
 
   rtx pow2[MAX_BITS_PER_WORD];
   rtx cint[MAX_BITS_PER_WORD];
-  int m, n;
-  enum machine_mode mode, wider_mode;
-  int speed;
+};
+
+static void
+init_expmed_one_mode (struct init_expmed_rtl *all,
+                      enum machine_mode mode, int speed)
+{
+  int m, n, mode_bitsize;
+
+  mode_bitsize = GET_MODE_UNIT_BITSIZE (mode);
+
+  PUT_MODE (&all->reg, mode);
+  PUT_MODE (&all->plus, mode);
+  PUT_MODE (&all->neg, mode);
+  PUT_MODE (&all->mult, mode);
+  PUT_MODE (&all->sdiv, mode);
+  PUT_MODE (&all->udiv, mode);
+  PUT_MODE (&all->sdiv_32, mode);
+  PUT_MODE (&all->smod_32, mode);
+  PUT_MODE (&all->wide_trunc, mode);
+  PUT_MODE (&all->shift, mode);
+  PUT_MODE (&all->shift_mult, mode);
+  PUT_MODE (&all->shift_add, mode);
+  PUT_MODE (&all->shift_sub0, mode);
+  PUT_MODE (&all->shift_sub1, mode);
+
+  add_cost[speed][mode] = set_src_cost (&all->plus, speed);
+  neg_cost[speed][mode] = set_src_cost (&all->neg, speed);
+  mul_cost[speed][mode] = set_src_cost (&all->mult, speed);
+  sdiv_cost[speed][mode] = set_src_cost (&all->sdiv, speed);
+  udiv_cost[speed][mode] = set_src_cost (&all->udiv, speed);
+
+  sdiv_pow2_cheap[speed][mode] = (set_src_cost (&all->sdiv_32, speed)
+                                  <= 2 * add_cost[speed][mode]);
+  smod_pow2_cheap[speed][mode] = (set_src_cost (&all->smod_32, speed)
+                                  <= 4 * add_cost[speed][mode]);
+
+  shift_cost[speed][mode][0] = 0;
+  shiftadd_cost[speed][mode][0] = shiftsub0_cost[speed][mode][0]
+    = shiftsub1_cost[speed][mode][0] = add_cost[speed][mode];
+
+  n = MIN (MAX_BITS_PER_WORD, mode_bitsize);
+  for (m = 1; m < n; m++)
+    {
+      XEXP (&all->shift, 1) = all->cint[m];
+      XEXP (&all->shift_mult, 1) = all->pow2[m];
+      shift_cost[speed][mode][m] = set_src_cost (&all->shift, speed);
+      shiftadd_cost[speed][mode][m] = set_src_cost (&all->shift_add, speed);
+      shiftsub0_cost[speed][mode][m] = set_src_cost (&all->shift_sub0, speed);
+      shiftsub1_cost[speed][mode][m] = set_src_cost (&all->shift_sub1, speed);
+    }
 
-  for (m = 1; m < MAX_BITS_PER_WORD; m++)
+  if (SCALAR_INT_MODE_P (mode))
     {
-      pow2[m] = GEN_INT ((HOST_WIDE_INT) 1 << m);
-      cint[m] = GEN_INT (m);
+      enum machine_mode wider_mode = GET_MODE_WIDER_MODE (mode);
+
+      if (wider_mode != VOIDmode)
+        {
+          PUT_MODE (&all->zext, wider_mode);
+          PUT_MODE (&all->wide_mult, wider_mode);
+          PUT_MODE (&all->wide_lshr, wider_mode);
+          XEXP (&all->wide_lshr, 1) = GEN_INT (mode_bitsize);
+
+          mul_widen_cost[speed][wider_mode]
+            =
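For readers unfamiliar with why the patch keeps separate cost tables for shifts, shift-and-add and shift-and-subtract, the following sketch shows the general idea of replacing a multiply by a constant with such operations. It is only an illustration under stated assumptions: the function names are hypothetical, the constants 10 and 7 are arbitrary, and this is not GCC's synth_mult algorithm, which searches for a cheap sequence using the costs computed above.

```c
/* Illustration only: constant multiplies written as shift/add/subtract
   sequences.  Unsigned arithmetic is used so that any overflow wraps
   in a well-defined way, exactly as the plain multiply would.  */

/* x * 10 == (x << 3) + (x << 1): two shifts and one add.  */
unsigned int
mul10_shift_add (unsigned int x)
{
  return (x << 3) + (x << 1);
}

/* x * 7 == (x << 3) - x: one shift and one subtract.  */
unsigned int
mul7_shift_sub (unsigned int x)
{
  return (x << 3) - x;
}
```

Whether such a sequence actually beats a hardware multiply depends on the target's costs, which is what tables like the ones initialized in the patch are meant to capture.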
[PATCH 3/3] Handle const_vector in mulv4si3 for pre-sse4.1.
---
 gcc/config/i386/i386-protos.h |    1 +
 gcc/config/i386/i386.c        |   76 +
 gcc/config/i386/predicates.md |    7
 gcc/config/i386/sse.md        |   72 +++---
 4 files changed, 97 insertions(+), 59 deletions(-)

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index f300a56..431db6c 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -222,6 +222,7 @@ extern void ix86_expand_reduc (rtx (*)(rtx, rtx, rtx), rtx, rtx);
 
 extern void ix86_expand_vec_extract_even_odd (rtx, rtx, rtx, unsigned);
 extern bool ix86_expand_pinsr (rtx *);
+extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
 
 /* In i386-c.c */
 extern void ix86_target_macros (void);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 578a756..0dc08f3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -38438,6 +38438,82 @@ ix86_expand_vec_extract_even_odd (rtx targ, rtx op0, rtx op1, unsigned odd)
   expand_vec_perm_even_odd_1 (&d, odd);
 }
 
+void
+ix86_expand_sse2_mulv4si3 (rtx op0, rtx op1, rtx op2)
+{
+  rtx op1_m1, op1_m2;
+  rtx op2_m1, op2_m2;
+  rtx res_1, res_2;
+
+  /* Shift both input vectors down one element, so that elements 3
+     and 1 are now in the slots for elements 2 and 0.  For K8, at
+     least, this is faster than using a shuffle.  */
+  op1_m1 = op1 = force_reg (V4SImode, op1);
+  op1_m2 = gen_reg_rtx (V4SImode);
+  emit_insn (gen_sse2_lshrv1ti3 (gen_lowpart (V1TImode, op1_m2),
+                                 gen_lowpart (V1TImode, op1),
+                                 GEN_INT (32)));
+
+  if (GET_CODE (op2) == CONST_VECTOR)
+    {
+      rtvec v;
+
+      /* Constant propagate the vector shift, leaving the dont-care
+         vector elements as zero.  */
+      v = rtvec_alloc (4);
+      RTVEC_ELT (v, 0) = CONST_VECTOR_ELT (op2, 0);
+      RTVEC_ELT (v, 2) = CONST_VECTOR_ELT (op2, 2);
+      RTVEC_ELT (v, 1) = const0_rtx;
+      RTVEC_ELT (v, 3) = const0_rtx;
+      op2_m1 = gen_rtx_CONST_VECTOR (V4SImode, v);
+      op2_m1 = force_reg (V4SImode, op2_m1);
+
+      v = rtvec_alloc (4);
+      RTVEC_ELT (v, 0) = CONST_VECTOR_ELT (op2, 1);
+      RTVEC_ELT (v, 2) = CONST_VECTOR_ELT (op2, 3);
+      RTVEC_ELT (v, 1) = const0_rtx;
+      RTVEC_ELT (v, 3) = const0_rtx;
+      op2_m2 = gen_rtx_CONST_VECTOR (V4SImode, v);
+      op2_m2 = force_reg (V4SImode, op2_m2);
+    }
+  else
+    {
+      op2_m1 = op2 = force_reg (V4SImode, op2);
+      op2_m2 = gen_reg_rtx (V4SImode);
+      emit_insn (gen_sse2_lshrv1ti3 (gen_lowpart (V1TImode, op2_m2),
+                                     gen_lowpart (V1TImode, op2),
+                                     GEN_INT (32)));
+    }
+
+  /* Widening multiply of elements 0+2, and 1+3.  */
+  res_1 = gen_reg_rtx (V4SImode);
+  res_2 = gen_reg_rtx (V4SImode);
+  emit_insn (gen_sse2_umulv2siv2di3 (gen_lowpart (V2DImode, res_1),
+                                     op1_m1, op2_m1));
+  emit_insn (gen_sse2_umulv2siv2di3 (gen_lowpart (V2DImode, res_2),
+                                     op1_m2, op2_m2));
+
+  /* Move the results in element 2 down to element 1; we don't care
+     what goes in elements 2 and 3.  Then we can merge the parts
+     back together with an interleave.
+
+     Note that two other sequences were tried:
+     (1) Use interleaves at the start instead of psrldq, which allows
+     us to use a single shufps to merge things back at the end.
+     (2) Use shufps here to combine the two vectors, then pshufd to
+     put the elements in the correct order.
+     In both cases the cost of the reformatting stall was too high
+     and the overall sequence slower.  */
+
+  emit_insn (gen_sse2_pshufd_1 (res_1, res_1, const0_rtx, const2_rtx,
+                                const0_rtx, const0_rtx));
+  emit_insn (gen_sse2_pshufd_1 (res_2, res_2, const0_rtx, const2_rtx,
+                                const0_rtx, const0_rtx));
+  res_1 = emit_insn (gen_vec_interleave_lowv4si (op0, res_1, res_2));
+
+  set_unique_reg_note (res_1, REG_EQUAL, gen_rtx_MULT (V4SImode, op1, op2));
+}
+
 /* Expand an insert into a vector register through pinsr insn.
    Return true if successful.  */
 
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 92db809..f23e932 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -816,6 +816,13 @@
   return false;
 })
 
+;; Return true when OP is a nonimmediate or a vector constant.  Note
+;; that most vector constants are not legitimate operands, so we need
+;; to special-case this.
+(define_predicate "nonimmediate_or_const_vector_operand"
+  (ior (match_code "const_vector")
+       (match_operand 0 "nonimmediate_operand")))
+
 ;; Return true if OP is a register or a zero.
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "register_operand")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 6a8206a..1f6fdb4 100644
---
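The instruction sequence in ix86_expand_sse2_mulv4si3 can be hard to follow from the RTL alone, so here is a portable scalar model of the same data flow, offered as a sketch rather than as the patch's implementation. The type v4si_model and all function names are hypothetical; each helper stands in for one step named in the patch comments (the whole-register shift, the widening multiply of elements 0 and 2, the pshufd with selectors 0, 2, 0, 0, and the low interleave). The model assumes the little-endian lane layout of x86, where each 64-bit product occupies two adjacent 32-bit elements, low half first.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for a vector of four 32-bit elements.  */
typedef struct { uint32_t e[4]; } v4si_model;

/* Models the 128-bit logical right shift by 32 bits: every element
   moves down one slot and the top slot becomes zero.  */
v4si_model
model_shift_down_one (v4si_model a)
{
  v4si_model r = { { a.e[1], a.e[2], a.e[3], 0 } };
  return r;
}

/* Models the widening unsigned multiply of elements 0 and 2.  Each
   64-bit product fills two adjacent 32-bit slots, low half first.  */
v4si_model
model_widening_mul_even (v4si_model a, v4si_model b)
{
  uint64_t p0 = (uint64_t) a.e[0] * b.e[0];
  uint64_t p1 = (uint64_t) a.e[2] * b.e[2];
  v4si_model r = { { (uint32_t) p0, (uint32_t) (p0 >> 32),
		     (uint32_t) p1, (uint32_t) (p1 >> 32) } };
  return r;
}

/* Models a shuffle with selectors (0, 2, 0, 0): the low halves of the
   two products end up in elements 0 and 1.  */
v4si_model
model_shuffle_0200 (v4si_model a)
{
  v4si_model r = { { a.e[0], a.e[2], a.e[0], a.e[0] } };
  return r;
}

/* Models the low interleave: a0, b0, a1, b1.  */
v4si_model
model_interleave_low (v4si_model a, v4si_model b)
{
  v4si_model r = { { a.e[0], b.e[0], a.e[1], b.e[1] } };
  return r;
}

/* Element-wise 32-bit multiply built from the steps above.  */
v4si_model
model_mulv4si (v4si_model x, v4si_model y)
{
  v4si_model even = model_widening_mul_even (x, y);
  v4si_model odd = model_widening_mul_even (model_shift_down_one (x),
					    model_shift_down_one (y));
  return model_interleave_low (model_shuffle_0200 (even),
			       model_shuffle_0200 (odd));
}
```

Because only the low 32 bits of each product are kept, the unsigned widening multiply gives the right answer for signed elements as well.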
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
Eric == Eric Botcazou ebotca...@adacore.com writes:

Tom> I don't understand what the code being external, or the review, has to
Tom> do with anything. This code is compiled with the same host compiler as
Tom> everything else.

Eric> But, precisely, this line of reasoning is barely defensible in my
Eric> opinion. If you really want to go that route, then let's stop
Eric> doing comprehensive reviews and stop requesting changes to
Eric> submitted patches in order to make them comply with the
Eric> agreed-upon practices, that would save time for everyone.

I never suggested anything like this. I suppose you are arguing ad absurdum here, but I don't think that this conclusion follows from the antecedents. I'm merely supporting Pedro's discovery that a rule, previously thought to have been important, was found by accident not to matter.

Tom> HOST_WIDE_INT is also not very persuasive to me. We did many things in
Tom> the past that became obsolete as compilers matured.

Eric> Why would HOST_WIDE_INT be obsolete? That's a nice way to
Eric> abstract the host and reverting to hardcoded types like 'long
Eric> long' doesn't seem a progress to me.

Yes, ok. I like typedefs too. I misunderstood what you were saying here.

Tom
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
On Jun 15, 2012, at 1:11 PM, Eric Botcazou wrote: Why would HOST_WIDE_INT be obsolete?

For the same reason that we don't use HOST_NARROW_INT instead of int. In practice, int is portable enough for us now. In reality, long long is portable for us now. 20 years ago, it wasn't portable enough. Times change. What's changed? Well, we now have a language standard for long long, other implementors have had a chance to implement that standard, systems have had a chance to update and provide implementations of that standard, and systems that don't support it have died from hardware failure or have been scrapped because they consume too much electricity to be useful anymore.

This situation is more like prototypes than patch reviews. See, patch reviews are useful for catching code bugs; HOST_WIDE_INT is not as useful for catching code bugs. Prototypes used to be newfangled things that very few compilers had. One could not portably use them. Guess what, times change, compilers implement them, language standards adopt them, and system vendors provide implementations that support them. The systems that never supported them go away, and the people who know of a world in which compilers don't support prototypes die.

We allow portability hacks into the source base for important systems (implementations) where we don't want to just nix a platform wholesale. See things like: /* This was a conditional expression but it triggered a bug in Sun C 5.5. */ in the source base. We do this not for some theoretical beauty but for very practical and pragmatic reasons. In time, even the above can be safely removed. We have already removed support for prototypes not being supported, and yet we still have patch reviews.

So, to be practical, let us list the systems, platforms and implementations we are thinking of nixing if we require long long to support at least 64-bit math. Let me start: OK, I'm done, now it is your turn. I'm fine with avoiding long long if there is a system people want to support that needs that; I am merely ignorant of such a system.

That's a nice way to abstract the host

Yes, but why abstract the host? HOST_NARROW_INT is a nice way to abstract the host as well; that is a necessary but not sufficient reason. We do it to support an actual, real system, platform or implementation that fails to provide long long. When there are no longer any such systems, then the time is right to switch to the standard. Now, why do we do this? Because we prefer standards, to aid readability and portability. A person new to gcc who knows C or C++ knows what long long is. HOST_WIDE_INT, well, they have to take a mental hit and figure it out, if they care.
Re: C++ PATCH for c++/53484 (wrong auto in template)
Back when we added C++11 auto deduction, I thought we could shortcut the normal deduction in some templates, when the type is adequately describable (thus the late, unlamented function describable_type). Over time various problems with this have arisen, of which this is the most recent; as a result, I'm giving up the attempt as a bad idea and just deferring auto deduction if the initializer is type-dependent. ... This has caused http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00085.html Dominique
Re: [PR49888, VTA] don't keep VALUEs bound to modified MEMs
On Jun 14, 2012, H.J. Lu hjl.to...@gmail.com wrote: On Tue, Jun 12, 2012 at 1:42 PM, Richard Henderson r...@redhat.com wrote: On 2012-06-05 12:33, Alexandre Oliva wrote: for gcc/ChangeLog from Alexandre Oliva aol...@redhat.com PR debug/49888 * var-tracking.c: Include alias.h. (overlapping_mems): New struct. (drop_overlapping_mem_locs): New. (clobber_overlapping_mems): New. (var_mem_delete_and_set, var_mem_delete): Call it. (val_bind): Likewise, but only if modified. (compute_bb_dataflow, emit_notes_in_bb): Call it on MEMs. * Makefile.in (var-tracking.o): Depend in $(ALIAS_H). for gcc/testsuite/ChangeLog from Alexandre Oliva aol...@redhat.com PR debug/49888 * gcc.dg/guality/pr49888.c: New. Ok. It caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53671 I see a few of these failures myself. They're in my test results. I guess I was so caught up in assessing the debug info quality changes with this patch that I completely failed to look at the test results, because they've been around for a while, and I have a few pristine-build baselines in between. Apologies for this mistake. Anyway... The problem is not too hard to understand, but it may be somewhat hard to fix. Basically, pushing registers to save them on the stack implies writes that are currently thought to conflict with the MEMs holding incoming arguments, and apparently there isn't enough information in the cselib static table for us to realize the write doesn't alias with any of the incoming arguments. Using the dynamic tables during alias testing is one possibility I'm looking into, but this won't be trivial and it could get expensive; another, that has just occurred to me while composing this message, is to use the cselib static table itself, for it *should* have enough info for us to realize that argp and sp offset are related and, given proper offsets, non-overlapping. Now, neither approach is going to be an immediate fix. 
Should I revert the patch, or can we live with some debug info completeness regressions for a bit? I wouldn't mind reverting it, but I won't unless the broken patch is actually causing trouble to any of us. Again, sorry about the breakage. -- Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
Re: [RFC 0/3] Stuff related to pr53533
On Fri, Jun 15, 2012 at 3:06 PM, Richard Henderson r...@redhat.com wrote: On 2012-06-15 14:42, H.J. Lu wrote: Latency/throughput info is in Intel optimization reference manual. Which instructions aren't covered? Ok, good. The rather old opt ref manual that I had didn't cover these. The one I downloaded this afternoon does. And it seems that we would need new costs fields to properly model the more recent intel cpus. I don't suppose I could convince you or your cohorts to add some new fields to the cost struct to handle vector integer operations? Prolly only need logical and mult? Hi Areg, Can you look into it? Thanks. -- H.J.
Re: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
On Fri, 15 Jun 2012, Mike Stump wrote: On Jun 15, 2012, at 2:46 PM, Joseph S. Myers wrote: HOST_WIDE_INT is an abstraction about the *target*; the target determines the required properties. The salient properties include: * At least as wide as target address space. The first person to do 128-bit address support isn't going to appreciate all the work they are going to have to do. With some luck, before then, we will have switched to C++ and engineered in some prettier interfaces that will just work with no changes. Today, it would be a major pain.

Well, you'll want 128-bit HOST_WIDE_INT to manipulate object sizes etc. for a 128-bit target. Say the compiler used for the host is GCC. If the host is 64-bit, you have __int128 and unsigned __int128 available (with older GCC, __int128_t and __uint128_t). There are just a couple of problems with those types, one a technical standards issue and one more serious as a practical issue:

* They are sui generis types that act quite like integer types but aren't actually integer types, because of the host ABI defining intmax_t as 64-bit.

* They lack any printf support (at least with glibc), and such support is needed by GCC for HOST_WIDE_INT.

Given control over the host C library, both issues could be addressed by changing intmax_t on the host to 128 bits - printf %j formats would then be appropriate for 128-bit types. (You'd need a host GCC change as well to define an integer constant suffix for 128-bit constants.) That certainly ought to be practical in glibc with symbol versioning if desired - no worse than the way various architectures moved to 128-bit long double. You'd probably also find places in GCC that assume that HOST_WIDE_INT is either 32-bit or 64-bit, but I expect it would be straightforward to adjust those to support 128-bit as well. True, C++ may make it possible to use something other than a built-in integer-like type of the host compiler, but I don't think that's needed for this.
(I'm not particularly concerned about 32-bit host support for this hypothetical 128-bit target; anyway, it should be practical to support __int128 for 32-bit systems, with some work on the libgcc side of things.) * Constants for the target can be represented in at most two HOST_WIDE_INT. This is nice in theory but no longer true for some of us. :-( This could also reasonably be cleaned up separately from other uses of HOST_WIDE_INT. Maybe what's really wanted is some abstraction for wide target constants - which usually would be much like double-int.[ch], but for targets that need it would be larger. Certainly the const_int / const_double division - where const_double represents *either* wide integers *or* floating-point constants - is an ugly interface, and it would be better for integers of whatever size to be const_int and const_double only to be floating point. -- Joseph S. Myers jos...@codesourcery.com
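As a rough illustration of what "a constant represented in at most two host words" means in practice, here is a minimal sketch of a two-word unsigned integer with carry-propagating addition. The type wide_pair and the function wide_pair_add are hypothetical names chosen for this example; they are not the double-int.[ch] interface, and a real abstraction for wide target constants would need many more operations and, for some targets, more than two words.

```c
#include <stdint.h>

/* Hypothetical two-word unsigned value: LOW holds bits 0..63 and
   HIGH holds bits 64..127.  Not GCC's double-int API.  */
typedef struct
{
  uint64_t low;
  uint64_t high;
} wide_pair;

/* Add two 128-bit values held as word pairs, wrapping modulo 2^128.
   A carry out of the low word is detected by the unsigned sum being
   smaller than one of its operands.  */
wide_pair
wide_pair_add (wide_pair a, wide_pair b)
{
  wide_pair r;
  r.low = a.low + b.low;
  r.high = a.high + b.high + (r.low < a.low);
  return r;
}
```

The fixed pair of words is exactly the limitation raised above: a target whose constants need more than 128 bits would not fit this representation.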