Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
The problem is that when vect_multiple_sizes is true, no correct number exists (at least, theoretically). That's because the number of diagnostic messages depends on the number of available vector sizes. For now this number is usually 2 (on x86 it's 256- and 128-bit vectors), so we could change 'xfail' to 'target'. But when wider vectors become available (512-bit), there will be fails again.

On 12 December 2011 11:46, Jakub Jelinek ja...@redhat.com wrote:

On Mon, Dec 12, 2011 at 11:06:37AM +0400, Michael Zolotukhin wrote:

diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
index 21b87a3..f75253e 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
@@ -88,5 +88,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target {! vect_multiple_sizes} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail vect_multiple_sizes} } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */

The xfails are IMHO undesirable, then you just stop testing those tests on very common developer platforms. IMHO you should just use a different dump-times count for vect_multiple_sizes (after checking it is the right count), if it doesn't depend on -mprefer-avx128 vs. -mno-prefer-avx128. If it does, then perhaps we want a predicate that details the vectorization factors and their order.

Jakub

--
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
On Mon, Dec 12, 2011 at 12:00:47PM +0400, Michael Zolotukhin wrote:

The problem is that when vect_multiple_sizes is true, no correct number exists (at least, theoretically). That's because the number of diagnostic messages depends on the number of available vector sizes. For now this number is usually 2 (on x86 it's 256- and 128-bit vectors), so we could change 'xfail' to 'target'. But when wider vectors become available (512-bit), there will be fails again.

Which is why introducing vect_multiple_sizes_32B_16B (for -mno-prefer-avx128) and vect_multiple_sizes_16B_32B (for -mprefer-avx128) and using them in the tests could solve it.

Jakub
Re: [committed] 4 backports from trunk to 4.6 branch
Jakub Jelinek writes:

On Sun, Dec 11, 2011 at 02:48:52PM +0100, Mikael Pettersson wrote:

This patch, r182112 on 4.6 branch, caused a test suite regression on arm-linux-gnueabi:

+FAIL: gcc.c-torture/execute/20050713-1.c compilation, -O2 (internal compiler error)
+UNRESOLVED: gcc.c-torture/execute/20050713-1.c execution, -O2
+FAIL: gcc.c-torture/execute/20050713-1.c compilation, -Os (internal compiler error)
+UNRESOLVED: gcc.c-torture/execute/20050713-1.c execution, -Os

because the compiler now ICEs:

20050713-1.c: In function 'bar3':
20050713-1.c:38:3: internal compiler error: in calculate_allocation, at vec.c:183

The same ICE also happens with today's trunk.

Please file it into bugzilla (and the other bug too), I'll have a look.

Jakub

Done, they are PR51510 and PR51511.
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
Hello!

This patch fixes dg-final scans in tests from vect.exp suite, which currently fail when avx2 is used.

--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
@@ -88,5 +88,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target {! vect_multiple_sizes} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail vect_multiple_sizes} } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */

Please do not add xfails through the patch; xfail means that a problem was identified and will someday be fixed. In the above case, just add a target condition, no need for an xfailed scan. If I'm not missing something, you can probably remove all introduced xfails and just add new target conditions.

 # Return 1 if avx instructions can be compiled.
+proc check_effective_explicit_target_avx { } {
+    return [check_no_messages_and_pattern e_avx !__builtin_ia32_vzeroall assembly {
+        void _mm256_zeroall (void)
+        {
+            __builtin_ia32_vzeroall ();
+        }
+    } -O2 ]
+}

Please use

# Return true if we are compiling for AVX target.

proc check_avx_available { } {
    return [check_no_compiler_messages avx_available assembly {
        #ifndef __AVX__
        #error unsupported
        #endif
    } ]
}

Uros.
[Patch, Darwin, Committed] fix over-length section name.
Section names can be at most 16 characters for Mach-O; I applied the following as obvious (r182220).

cheers
Iain

gcc:

	* config/darwin-sections.def (zobj_const_data_section): Fix over-length section name.

Index: gcc/config/darwin-sections.def
===================================================================
--- gcc/config/darwin-sections.def (revision 182219)
+++ gcc/config/darwin-sections.def (working copy)
@@ -76,7 +76,7 @@ DEF_SECTION (const_data_coal_section, SECTION_NO_A
              ".section __DATA,__const_coal,coalesced", 0)
 /* Place to put zero-sized to avoid issues with section anchors.  */
 DEF_SECTION (zobj_const_data_section, SECTION_NO_ANCHOR,
-             ".section\t__DATA,__zobj_const_data", 0)
+             ".section\t__DATA,__zobj_cnst_data", 0)
 /* Strings and other literals.  */
 DEF_SECTION (cstring_section, SECTION_MERGE | SECTION_STRINGS,
              ".cstring", 0)
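The length limit behind this fix can be checked mechanically. The following is a minimal sketch, not part of the committed patch; `macho_section_name_fits` and `MACHO_NAME_MAX` are hypothetical names chosen for illustration. It shows that the old name is 17 characters long while the renamed one is exactly 16.

```c
#include <stddef.h>
#include <string.h>

/* Mach-O stores section names in a fixed 16-byte field. */
#define MACHO_NAME_MAX 16

/* Return 1 if NAME fits in a Mach-O section-name field, 0 otherwise.
   Hypothetical helper, for illustration only. */
static int macho_section_name_fits(const char *name)
{
    return strlen(name) <= MACHO_NAME_MAX;
}
```

With this helper, `macho_section_name_fits("__zobj_const_data")` is 0 and `macho_section_name_fits("__zobj_cnst_data")` is 1.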
Re: [PATCH] Fix vectorizer ICEs with calls with MEM_REF arguments (PR tree-optimization/51485)
On Sun, 11 Dec 2011, Ira Rosen wrote:

On 9 December 2011 19:08, Jakub Jelinek ja...@redhat.com wrote:

Hi!

As mentioned in the PR, we ICE on the following testcase, because there are DRs in a GIMPLE_CALL stmt and when there is just one, we compute vectype for the call as if it were a load or store, but during computation of vectorization factor we only consider the return value of the call. As such calls are not vectorizable anyway, the following patch just gives up on them.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk (and with the if (bb_vinfo)/if (gather) parts removed for 4.6 too)?

OK for trunk.

Also for the branch.

Thanks,
Richard.

Thanks,
Ira

2011-12-09  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/51485
	* tree-vect-data-refs.c (vect_analyze_data_refs): Give up on DRs in call stmts.

	* g++.dg/vect/pr51485.cc: New test.

--- gcc/tree-vect-data-refs.c.jj 2011-12-02 01:52:26.325893329 +0100
+++ gcc/tree-vect-data-refs.c 2011-12-09 13:27:29.726668859 +0100
@@ -2896,6 +2896,26 @@ vect_analyze_data_refs (loop_vec_info lo
           return false;
         }

+      if (is_gimple_call (stmt))
+        {
+          if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
+            {
+              fprintf (vect_dump, "not vectorized: dr in a call ");
+              print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM);
+            }
+
+          if (bb_vinfo)
+            {
+              STMT_VINFO_VECTORIZABLE (stmt_info) = false;
+              stop_bb_analysis = true;
+              continue;
+            }
+
+          if (gather)
+            free_data_ref (dr);
+          return false;
+        }
+
       /* Update DR field in stmt_vec_info struct.  */

       /* If the dataref is in an inner-loop of the loop that is considered for
--- gcc/testsuite/g++.dg/vect/pr51485.cc.jj 2011-12-09 13:28:45.155281405 +0100
+++ gcc/testsuite/g++.dg/vect/pr51485.cc 2011-12-09 13:28:57.692205773 +0100
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+struct A { A (); unsigned int a; };
+double bar (A a) throw () __attribute__((pure));
+
+void
+foo (unsigned int x, double *y, A *z)
+{
+  unsigned int i;
+  for (i = 0; i < x; i++)
+    y[i] = bar (z[i]);
+}
+
+/* { dg-final { cleanup-tree-dump "vect" } } */

Jakub

--
Richard Guenther rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
Re: warn about deprecated access declarations
Jonathan Wakely jwakely@gmail.com writes:

On 11 December 2011 22:22, Fabien Chêne wrote:

Consequently, I propose to deprecate them with a warning, as clang already does. So that you get a warning for the following code:

struct A { int i; };
struct B : A
{
  A::i; // <- warning here
};

warning: access declarations are deprecated; employ using declarations instead [-Wdeprecated]

Whether or not it's suitable for stage 3, "employ" feels a bit clunky in this context, how about "access declarations are deprecated in favour of using-declarations"?

How about "...; suggest adding the 'using' keyword"? "using declarations" is ambiguous, it is not clear that "using" means the keyword here.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
Re: warn about deprecated access declarations
On 12 December 2011 09:18, Andreas Schwab wrote:

Jonathan Wakely jwakely@gmail.com writes:

On 11 December 2011 22:22, Fabien Chêne wrote:

Consequently, I propose to deprecate them with a warning, as clang already does. So that you get a warning for the following code:

struct A { int i; };
struct B : A
{
  A::i; // <- warning here
};

warning: access declarations are deprecated; employ using declarations instead [-Wdeprecated]

Whether or not it's suitable for stage 3, "employ" feels a bit clunky in this context, how about "access declarations are deprecated in favour of using-declarations"?

How about "...; suggest adding the 'using' keyword"?

That sounds like the compiler is suggesting that the user suggests doing that!

"using declarations" is ambiguous, it is not clear that "using" means the keyword here.

That's why I put the hyphen in using-declarations :-) but this is turning into a bike shed issue.
Re: [patch] Remove occurrences of int64_t (and int32_t)
On Sat, Dec 10, 2011 at 6:23 PM, Eric Botcazou ebotca...@adacore.com wrote:

Hi,

this removes all the occurrences of int64_t in the host code, as well as some gratuitous occurrences of int32_t (there are real ones in DFP and LTO code).

Tested on i586-suse-linux and x86_64-suse-linux. Any objections?

Are the LTO files present in the gcc directory compiled when LTO is disabled?

Yes they are, but they will be unused at runtime.

If so, a compiler with a 64-bit type is required on the host since GCC 4.5.0.

-      size = (HOST_BITS_PER_WIDE_INT >= 64)
-        ? (uint64_t) int_size_in_bytes (TREE_TYPE (t))
-        : (((uint64_t) TREE_INT_CST_HIGH (DECL_SIZE_UNIT (t))) << 32)
-          | TREE_INT_CST_LOW (DECL_SIZE_UNIT (t));
+#if HOST_BITS_PER_WIDE_INT >= 64
+      size = (unsigned host_int64) int_size_in_bytes (TREE_TYPE (t));
+#else
+      size
+        = (unsigned host_int64) TREE_INT_CST_HIGH (DECL_SIZE_UNIT (t)) << 32
+          || (unsigned host_int64) TREE_INT_CST_LOW (DECL_SIZE_UNIT (t));
+#endif

and this pattern looks bogus anyway (using TYPE_SIZE vs. DECL_SIZE). Please simply switch it to unconditional use of tree_to_double_int (DECL_SIZE_UNIT (t)).low and make 'size' a HOST_WIDE_INT (we properly require 64 bit hwi for targets that have 64bit sizes/pointers).

+#if HOST_BITS_PER_WIDE_INT >= 64
+# define host_int64 HOST_WIDE_INT
+#elif HOST_BITS_PER_WIDEST_INT >= 64
+# define host_int64 HOST_WIDEST_INT
+#else
+# error host has no 64-bit type
+#endif

well, as previous communication has shown we should use HOST_WIDEST_INT unconditionally for a 64-bit type (allowing the code to compile when no such type is available during stage1). If we really need a true 64bit type then we should amend hwint.h accordingly.

Otherwise ok (the s/int32_t/int/ cases are obvious).

Thanks,
Richard.

2011-12-10  Eric Botcazou  ebotca...@adacore.com

	* lto-streamer-out.c (write_symbol): Use proper 64-bit host type.
	* lto-cgraph.c (input_cgraph_opt_section): Use 'int' for offsets.
	* lto-streamer-in.c (lto_read_body): Likewise.
	(lto_input_toplevel_asms): Likewise.
	* lto-section-in.c (lto_create_simple_input_block): Likewise.
	* ipa-inline-analysis.c (inline_read_section): Likewise.
	* ipa-prop.c (ipa_prop_read_section): Likewise.

lto/
	* lto.h (lto_parse_hex): Delete.
	* lto.c (lto_read_decls): Use 'int' for offsets.
	(lto_parse_hex): Make static and return proper 64-bit host type.
	(lto_resolution_read): Use proper 64-bit host type.

--
Eric Botcazou
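The fallback branch in the quoted hunk joins a high and a low word into one 64-bit size. As extracted here, the added lines show `||` where the removed lines used a bitwise `|`, so the choice of operator is worth spelling out. The sketch below uses hypothetical helper names that do not exist in GCC, and assumes 32-bit halves.

```c
#include <stdint.h>

/* Combine two 32-bit halves into one 64-bit value with bitwise OR. */
static uint64_t combine_halves(uint32_t high, uint32_t low)
{
    return ((uint64_t) high << 32) | (uint64_t) low;
}

/* Same expression with logical OR: the result is only ever 0 or 1,
   because || yields a truth value rather than merging bits. */
static uint64_t combine_halves_logical(uint32_t high, uint32_t low)
{
    return ((uint64_t) high << 32) || (uint64_t) low;
}
```

For `high = 1, low = 2` the first helper yields `0x100000002`, while the second yields `1`.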
Re: [patch] PR51388
On Sun, Dec 11, 2011 at 6:03 PM, Steven Bosscher stevenb@gmail.com wrote:

Hello,

The configure scripts check for -Wno-narrowing, but GCC ignores rather than rejects unknown -Wno-* warnings. Fixed by checking for the positive warning, -Wnarrowing.

OK for trunk?

But that will now pass -Wnarrowing instead of -Wno-narrowing to the build. So I think the fix should be done to ACX_PROG_CC_WARNING_OPTS which should strip 'no-' from the option before checking it (well, possibly testing both the -W and the -Wno- variant) and append the -Wno- variant.

Richard.

Ciao!
Steven
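The "strip 'no-' before checking" idea can be illustrated outside of autoconf. The real change would live in the m4 macro ACX_PROG_CC_WARNING_OPTS; the C function below is only a sketch of the string transformation, and `positive_warning_opt` is a hypothetical name. The caller would probe the compiler with the positive form and, if it is accepted, still append the original -Wno- form to the flags.

```c
#include <stdio.h>
#include <string.h>

/* Write the positive form of a -Wno-* option into OUT (size OUTSZ).
   "-Wno-narrowing" becomes "-Wnarrowing"; other options are copied
   unchanged.  Returns 1 if a "no-" prefix was stripped, 0 otherwise. */
static int positive_warning_opt(const char *opt, char *out, size_t outsz)
{
    static const char prefix[] = "-Wno-";
    size_t plen = sizeof prefix - 1;

    if (strncmp(opt, prefix, plen) == 0) {
        snprintf(out, outsz, "-W%s", opt + plen);
        return 1;
    }
    snprintf(out, outsz, "%s", opt);
    return 0;
}
```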
Re: [patch] Fix PR tree-optimization/50569
Hi,

On Sat, Dec 10, 2011 at 10:31:23PM +0100, Eric Botcazou wrote:

Hi,

this is a regression present on mainline and 4.6 branch at -O for the SPARC. The compiler again generates an unaligned access for the memcpy calls in:

struct event {
  struct {
    unsigned int sec;
  } sent __attribute__((packed));
};

void __attribute__((noinline,noclone)) frob_entry(char *buf)
{
  struct event event;

  __builtin_memcpy(&event, buf, sizeof(event));
  if (event.sent.sec < 64) {
    event.sent.sec = -1U;
    __builtin_memcpy(buf, &event, sizeof(event));
  }
}

I believe there are many manifestations of this issue; the ones I track are PR 50052 and PR 50444, which even has an x86_64 SSE testcase.

Unsurprisingly enough, the trick used in build_ref_for_model (in case this is a reference to a component, the function will replicate the last COMPONENT_REF of model's expr to access it) isn't sufficient anymore with MEM_REFs around, since MEM_REFs can encapsulate an arbitrary number of inner references. Fixed by extending the trick to chain of COMPONENT_REFs.

Well, I can live with this change (though I cannot approve anything). On the other hand, the real underlying problem is that the expander cannot handle unaligned MEM_REFs where strict alignment is required. SRA is of course much more prone to create such situations than anything else, but I wonder whether they can creep up elsewhere too. It also takes us in the opposite direction from the one initially intended with MEM_REFs, doesn't it?

That said, I looked into the expander briefly in summer, but given my level of experience in that area I did not have nearly enough time. I still plan to look into this issue in the expander, but for the same reasons I cannot guarantee any quick success. So I acknowledge this is the only working approach to a long-standing difficult bug... and most probably the most appropriate for the 4.6 branch.

However, since we have them, shouldn't we use stack-based vectors to handle the stack of COMPONENT_REFs?

Thanks,
Martin

Tested on x86/Linux and SPARC/Solaris, OK for mainline and 4.6 branch?

2011-12-10  Eric Botcazou  ebotca...@adacore.com

	PR tree-optimization/50569
	* tree-sra.c (build_ref_for_model): Replicate a chain of COMPONENT_REFs in the expression of MODEL instead of just the last one.

2011-12-10  Eric Botcazou  ebotca...@adacore.com

	* gcc.c-torture/execute/20111210-1.c: New test.

--
Eric Botcazou

/* PR tree-optimization/50569 */
/* Reported by Paul Koning pkon...@gcc.gnu.org */
/* Reduced testcase by Mikael Pettersson mi...@it.uu.se */

struct event {
  struct {
    unsigned int sec;
  } sent __attribute__((packed));
};

void __attribute__((noinline,noclone)) frob_entry(char *buf)
{
  struct event event;

  __builtin_memcpy(&event, buf, sizeof(event));
  if (event.sent.sec < 64) {
    event.sent.sec = -1U;
    __builtin_memcpy(buf, &event, sizeof(event));
  }
}

int main(void)
{
  union {
    char buf[1 + sizeof(struct event)];
    int align;
  } u;

  __builtin_memset(&u, 0, sizeof u);

  frob_entry(&u.buf[1]);

  return 0;
}

Index: tree-sra.c
===================================================================
--- tree-sra.c (revision 182102)
+++ tree-sra.c (working copy)
@@ -1493,32 +1493,61 @@ build_ref_for_offset (location_t loc, tr
 }

 /* Construct a memory reference to a part of an aggregate BASE at the given
-   OFFSET and of the same type as MODEL.  In case this is a reference to a
-   component, the function will replicate the last COMPONENT_REF of model's
-   expr to access it.  GSI and INSERT_AFTER have the same meaning as in
-   build_ref_for_offset.  */
+   OFFSET and of the type of MODEL.  In case this is a chain of references
+   to component, the function will replicate the chain of COMPONENT_REFs of
+   the expression of MODEL to access it.  GSI and INSERT_AFTER have the same
+   meaning as in build_ref_for_offset.  */

 static tree
 build_ref_for_model (location_t loc, tree base, HOST_WIDE_INT offset,
                      struct access *model, gimple_stmt_iterator *gsi,
                      bool insert_after)
 {
+  tree type = model->type, t;
+  VEC(tree,heap) *stack = NULL;
+
   if (TREE_CODE (model->expr) == COMPONENT_REF)
     {
-      tree t, exp_type, fld = TREE_OPERAND (model->expr, 1);
-      tree cr_offset = component_ref_field_offset (model->expr);
+      tree expr = model->expr;
+
+      /* Create a stack of the COMPONENT_REFs so later we can walk them in
+         order from inner to outer.  */
+      stack = VEC_alloc (tree, heap, 6);
+
+      do {
+        tree field = TREE_OPERAND (expr, 1);
+        tree cr_offset = component_ref_field_offset (expr);
+        gcc_assert (cr_offset && host_integerp (cr_offset, 1));
+
+        offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
+        offset -=
Re: [patch] Remove occurrences of int64_t (and int32_t)
@@ -993,7 +1005,7 @@ lto_resolution_read (splay_tree file_ids
 {
   int t;
   char offset_p[17];
-  int64_t offset;
+  host_int64 offset;
   t = fscanf (resolution, "@0x%16s", offset_p);
   if (t != 1)
     internal_error ("could not parse file offset");

This should be off_t, I believe.

--
Janne Blomqvist
Re: [patch libjava]: Fix for PR libgcj/50053
On 12/10/2011 12:05 PM, Kai Tietz wrote:

2011-12-10  Kai Tietz  kti...@redhat.com

	PR libgcj/50053
	* java/lang/natClass.cc (java::lang::Class::newInstance): Special case member-call for 32-bit IA native Window target.

OK, thanks.

Andrew.
Re: warn about deprecated access declarations
Jonathan Wakely jwakely@gmail.com writes:

On 12 December 2011 09:18, Andreas Schwab wrote:

Jonathan Wakely jwakely@gmail.com writes:

On 11 December 2011 22:22, Fabien Chêne wrote:

Consequently, I propose to deprecate them with a warning, as clang already does. So that you get a warning for the following code:

struct A { int i; };
struct B : A
{
  A::i; // <- warning here
};

warning: access declarations are deprecated; employ using declarations instead [-Wdeprecated]

Whether or not it's suitable for stage 3, "employ" feels a bit clunky in this context, how about "access declarations are deprecated in favour of using-declarations"?

How about "...; suggest adding the 'using' keyword"?

That sounds like the compiler is suggesting that the user suggests doing that!

It is similar to "suggest parentheses".

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
Re: [patch] Remove occurrences of int64_t (and int32_t)
On Mon, Dec 12, 2011 at 12:02, Janne Blomqvist blomqvist.ja...@gmail.com wrote:

@@ -993,7 +1005,7 @@ lto_resolution_read (splay_tree file_ids
 {
   int t;
   char offset_p[17];
-  int64_t offset;
+  host_int64 offset;
   t = fscanf (resolution, "@0x%16s", offset_p);
   if (t != 1)
     internal_error ("could not parse file offset");

This should be off_t, I believe.

Bah, scratch that. My bad.

--
Janne Blomqvist
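The quoted hunk reads at most 16 hexadecimal characters into `offset_p`, and 16 hex digits encode exactly 64 bits, which is why the variable holding the parsed result needs a 64-bit host type. The following is a minimal sketch of such a parser; `parse_hex64` is a hypothetical stand-in and is not GCC's `lto_parse_hex`.

```c
#include <stdint.h>

/* Parse up to 16 hexadecimal digits into a 64-bit value.
   Returns 0 on success, -1 on an empty string, a non-hex character,
   or more than 16 digits. */
static int parse_hex64(const char *p, uint64_t *out)
{
    uint64_t val = 0;
    int ndigits = 0;

    for (; *p != '\0'; p++) {
        unsigned digit;
        char c = *p;

        if (c >= '0' && c <= '9')
            digit = (unsigned) (c - '0');
        else if (c >= 'a' && c <= 'f')
            digit = (unsigned) (c - 'a' + 10);
        else if (c >= 'A' && c <= 'F')
            digit = (unsigned) (c - 'A' + 10);
        else
            return -1;

        if (++ndigits > 16)
            return -1;          /* would not fit in 64 bits */
        val = (val << 4) | digit;
    }
    if (ndigits == 0)
        return -1;
    *out = val;
    return 0;
}
```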
Re: [PATCH 5/6] mips: Implement vec_perm_const.
Richard Henderson r...@redhat.com writes:

On 12/11/2011 04:50 AM, Richard Sandiford wrote:

[Mingjie, please could you help with the Loongson question near the end?]

Actually, can you tell me how to test these abi combinations? I keep trying to use mips-sim or mips64-sim and get linker errors complaining of abi combinations.

I tend to use mips64{,el}-linux-gnu with a hacked-up QEMU (hacked up to add MIPS16 to the cpu model, which isn't relevant here). But I'm surprised *-elf is causing problems. Something like mipsisa64-elfoabi ought to just work (I last tested that a few weeks ago).

Little-endian:

The semantics of the RTL pattern are:

{ 0L, 0U } = { X[I3], X[I4 + 2] }, where X = { 1L, 1U, 2L, 2U }

so: 0L = { 1L, 1U }[I3] (= <bop><bUL>)
    0U = { 2L, 2U }[I4] (= <aop><aUL>)

    <aop> = 2, <aUL> = I4 ? U : L
    <bop> = 1, <bUL> = I3 ? U : L

    [LL] !I4 && !I3   [UL] I4 && !I3
    [LU] !I4 && I3    [UU] I4 && I3

Big-endian:

The semantics of the RTL pattern are:

{ 0U, 0L } = { X[I3], X[I4 + 2] }, where X = { 1U, 1L, 2U, 2L }

so: 0U = { 1U, 1L }[I3] (= <aop><aUL>)
    0L = { 2U, 2L }[I4] (= <bop><bUL>)

    <aop> = 1, <aUL> = I3 ? L : U
    <bop> = 2, <bUL> = I4 ? L : U

    [UU] !I3 && !I4   [UL] !I3 && I4
    [LU] I3 && !I4    [LL] I3 && I4.  */

which suggests that the PUL and PLU entries for big-endian should be the other way around. Does that sound right, or have I misunderstood?

Yes, that sounds right.

...for little-endian, we need to pass the U and L components of the mnemonic in the reverse order: the MIPS instruction specifies the upper part first, whereas the rtl pattern specifies the lower part first. And for little-endian, U refers to memory element 1 and L to memory element 0. So I think this should be: ...

Except that the actual output of the LE insn actually swaps the operands too. So I think these expanders should not *also* swap the operands. I've tidied these up a bit since then.

Hmm, are you sure? The order of the operands passed to these p?? expanders is supposed to match the order of the operands in the final asm instruction. A user's A = __builtin_mips_plu_ps (B, C) corresponds to gen_mips_plu_ps (A, B, C), which must always generate PLU.PS A, B, C, etc. So if the define_insn swaps the operands (which from above, it must for little-endian), then these expanders need to swap too, to undo the effect. Or, taking the longer version from yesterday:

;; Expanders for builtins.  The instruction:
;;
;;     P[UL][UL].PS result, a, b
;;
;; says that the upper part of result is taken from half of a and
;; the lower part of result is taken from half of b.  This means
;; that the P[UL][UL].PS operand order matches memory order on big-endian
;; targets; a is element 0 of the V2SF result while b is element 1.
;; However, the P[UL][UL].PS operand order is the reverse of memory order
;; on little-endian targets; a is element 1 of the V2SF result while
;; b is element 0.  The arguments to vec_perm_const_ps are always in
;; memory order.
;;
;; Similarly, U corresponds to element 0 on big-endian targets but
;; to element 1 on little-endian targets.

(would be nice to have these comments in the patch if nothing else). Because of that, I think I preferred the original style, with no SET rtl pattern in the expander, and calls to emit_insn (gen_...) in the C code.

I think this is endian-dependent. For little-endian, the bottom two bits of the mask determine element 0; for big-endian, the top two bits of the mask do.

Recall that loongson can only run in little-endian.

Doh.

I added comments about that in the md file, but it would do no harm to add another here.

Thanks.

Richard
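The operand-order rule in the ";; Expanders for builtins" comment can be modelled in a few lines of C. This is a sketch under stated assumptions, not code from the MIPS backend: the struct, the enum and the helper names (`pxy_ps`, `ps_to_memory`) are all hypothetical. It models `P<x><y>.PS result, a, b` as "upper half of result comes from the x-half of a, lower half of result comes from the y-half of b", and then shows how the register halves map to memory elements on each endianness.

```c
/* A paired-single value: two floats, named by register half. */
struct ps {
    float upper;
    float lower;
};

/* Which half of a source operand a mnemonic letter selects. */
enum half { HALF_U, HALF_L };

static float pick(struct ps v, enum half h)
{
    return h == HALF_U ? v.upper : v.lower;
}

/* Model of P<x><y>.PS result, a, b: the upper part of the result is the
   x-half of a, the lower part of the result is the y-half of b. */
static struct ps pxy_ps(enum half x, enum half y, struct ps a, struct ps b)
{
    struct ps r;
    r.upper = pick(a, x);
    r.lower = pick(b, y);
    return r;
}

/* Store V in memory order: the upper half is element 0 on big-endian
   targets and element 1 on little-endian targets. */
static void ps_to_memory(struct ps v, int big_endian, float mem[2])
{
    mem[big_endian ? 0 : 1] = v.upper;
    mem[big_endian ? 1 : 0] = v.lower;
}
```

In this model, PLU.PS corresponds to `pxy_ps(HALF_L, HALF_U, a, b)`; storing its result gives the a-derived value at element 0 on big-endian and at element 1 on little-endian, which is the reversal the comment describes.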
Re: [patch] add __is_final trait to fix libstdc++/51365
On 12/11/2011 04:05 PM, Jonathan Wakely wrote:

ping

In my opinion __is_final would definitely be useful in general, for 4.8, and for 4.7 too, if it isn't too late.

As regards the wider issue which is being discussed on the reflector (beware, I didn't follow all the messages), 'final' disabling a nice optimization like EBO makes me very nervous. Really, it doesn't seem part of the intended general philosophy in this area. There must be a way to overcome the annoyance. As a last resort, if suggestions like having 'final' not forbid private derivation cannot go through, we could imagine a GCC attribute reverting the effect of 'final' for people (library writers ;) knowing what they are doing. I don't know.

Paolo.
Re: [PATCH] PR target/50038 fix: redundant zero extensions removal
Here is a patch which introduces new pass 'ree' based on pass 'implicit_zee' as was discussed above.

Thanks.

2011-11-22  Enkovich Ilya  ilya.enkov...@intel.com

	PR target/50038
	* implicit-zee.c: Delete.
	* ree.c: New file.
	* Makefile.in: Replace implicit-zee.c with ree.c.
	* i386.c (ix86_option_override_internal): Set flag_ree for 32 bit platform.
	* config/i386/i386.c (ix86_option_override_internal): ...
	* common.opt (fzee): Ignored.
	(free): New.
	* passes.c (init_optimization_passes): Replace pass_implicit_zee with pass_ree.
	* tree-pass.h (pass_implicit_zee): Delete.
	(pass_ree): New.
	* timevar.def (TV_ZEE): Delete.
	(TV_REE): New.

It would be nice to add something to doc/invoke.texi about -free.

The patch is mostly OK, but a few changes are required:

+/* Problem Description :
+
+   This pass is intended to remove redundant extension instructions.
+   Such instructions appeare for different reasons. We expect some of

appear without terminal 'e'.

+   them due to implicit zero-extend in 64-bit registers after writing to

zero-extension

+   their lower 32-bit half (as in x86_64 arch).

(e.g. for the x86-64 architecture).

+   Another possible reason is a type cast which follows load (for instance
+   register restore) which can be combined into single instruction in the
+   most cases.

Another possible reason is a type cast which follows a load (for instance a register restore) and which can be combined into a single instruction, and for which earlier local passes, e.g. the combiner, weren't able to optimize.

+   extension instruction that could possibly be redundant. Such extension

double space after the period

+   For example, in x86_64, implicit zero-extensions are captured with

For example, for the x86-64 architecture, implicit...

+   Architectures like x86_64 support conditional moves whose semantics for

x86-64

+   Basic ZEE pass reported reduction of the dynamic instruction count of a

Let's use the wording "The original redundant zero-extension elimination pass"

+   The most performance gain from REE pass in addition to ZEE pass is expected

"The additional performance gain with the enhanced pass is mostly expected..."

+/* This structure is used to hold data about candidate for
+   elimination.  */
+
+typedef struct ext_cand
+{
+  rtx insn;
+  const_rtx ext_expr;

No need to repeat the ext_ prefix, "const_rtx expr;" is fine.

+  enum machine_mode src_mode;
+} *ext_cand_t;
+
+static alloc_pool ext_cand_pool;

+/* Carry information about extensions while walking the RTL.  */
+
+DEF_VEC_P(ext_cand_t);
+DEF_VEC_ALLOC_P(ext_cand_t, heap);

The combination of a pool with a heap-allocated vector of pointers looks a little convoluted. Can't you use a vector of objects (DEF_VEC_O) directly?

+/* Returns the merge code status for INSN.  */
+
+static enum insn_merge_code
+get_insn_status (rtx insn)

I know this was in the original file, but in the head comment of functions, this should be

/* Return the merge code status for INSN.  */

+/* Sets the merge code status of INSN to CODE.  */

/* Set the merge code status of INSN to CODE.  */

and so on.

+/* Given a insn (CURR_INSN) and a pointer to the SET rtx (ORIG_SET)
+   that needs to be modified, this code modifies the SET rtx to a
+   new SET rtx that extends the right hand expression into a
+   register (NEWREG) on the left hand side.  Note that multiple
+   assumptions are made about the nature of the set that needs
+   to be true for this to work and is called from merge_def_and_ext.
+
+   Original :
+   (set (reg a) (expression))
+
+   Transform :
+   (set (reg a) (extend (expression)))
+
+   Special Cases :
+   If the expression is a constant or another extend directly
+   assign it to the register.  */
+
+static bool
+combine_set_extend (ext_cand_t cand, rtx curr_insn, rtx *orig_set)

The head comment is outdated: NEWREG is gone and CAND has appeared.

+  /* Merge constants by directly moving the constant into the
+     register under some conditions.  */
+
+  if (GET_CODE (orig_src) == CONST_INT
+      && HOST_BITS_PER_WIDE_INT >= GET_MODE_BITSIZE (dst_mode))
+    {
+      if (INTVAL (orig_src) >= 0)
+        new_set = gen_rtx_SET (VOIDmode, newreg, orig_src);
+      else if (GET_CODE (orig_src) == ZERO_EXTEND)
+        {
+          /* Zero-extending a negative SImode integer into DImode
+             makes it a positive integer.  Convert the given negative
+             integer into the appropriate integer when zero-extended.  */
+
+          delta_width = HOST_BITS_PER_WIDE_INT - GET_MODE_BITSIZE (SImode);
+          mask = (~(unsigned HOST_WIDE_INT) 0) >> delta_width;
+          val = INTVAL (orig_src);
+          val = val & mask;
+          new_const_int = gen_rtx_CONST_INT (VOIDmode, val);
+          new_set = gen_rtx_SET (VOIDmode, newreg, new_const_int);
+        }
+      else if (GET_CODE (orig_src) == SIGN_EXTEND)
+
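The constant-merging hunk quoted above turns a negative 32-bit constant into the positive value it has after zero-extension. A minimal sketch of that arithmetic follows, assuming the operators lost from the quoted text are a right shift (for the mask) and a bitwise AND (for the masking), and fixing the widths at 64 and 32 bits. The function name is hypothetical and the fixed widths stand in for HOST_BITS_PER_WIDE_INT and GET_MODE_BITSIZE (SImode).

```c
#include <stdint.h>

/* Zero-extend the low 32 bits of VAL into a 64-bit value: build an
   all-ones mask, shift it right by the width difference, and AND. */
static uint64_t zero_extend_si_to_di(int64_t val)
{
    const unsigned wide_bits = 64;  /* stands in for HOST_BITS_PER_WIDE_INT */
    const unsigned src_bits = 32;   /* stands in for GET_MODE_BITSIZE (SImode) */
    unsigned delta_width = wide_bits - src_bits;
    uint64_t mask = ~(uint64_t) 0 >> delta_width;   /* 0x00000000ffffffff */

    return (uint64_t) val & mask;
}
```

Under these assumptions, `-1` becomes `0xffffffff`, while non-negative values that fit in 32 bits pass through unchanged.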
Re: [PATCH] Fix PR46796
On Fri, 9 Dec 2011, Richard Guenther wrote:

This fixes PR46796 by making sure the types in the type variant chain can be looked up again using get_qualified_type.

LTO bootstrap and regtest running on x86_64-unknown-linux-gnu.

Actually I didn't like that patch very much and here is a much simpler and more localized variant - simply make sure the TYPE_NAMEs are entered into the streamer cache at the time we pre-load the type nodes.

LTO bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2011-12-12  Richard Guenther  rguent...@suse.de

	PR lto/46796
	* tree-streamer.c (record_common_node): Also pre-load TYPE_NAMEs.

Index: gcc/tree-streamer.c
===================================================================
--- gcc/tree-streamer.c (revision 182220)
+++ gcc/tree-streamer.c (working copy)
@@ -277,6 +277,15 @@ record_common_node (struct streamer_tree
       for (f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
         record_common_node (cache, f);
     }
+
+  /* To make qualified type variants pass the check_qualified_type test
+     we have to make sure to properly share TYPE_NAME.  Do so by also
+     pre-loading that to the cache.  See PR46796.
+     ???  To properly preserve name differences from different frontends
+     we should stop pre-loading those type nodes to the cache completely
+     instead.  */
+  if (TYPE_P (node))
+    record_common_node (cache, TYPE_NAME (node));
 }
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
I changed xfails to target-checks - for now I use the common vect_multiple_sizes (though it'll fail when wider vectors emerge). Also, I changed the AVX-check to the version Uros suggested. Please check the updated patch (attached).

As for vect_multiple_sizes_32B_16B and similar - isn't it too target-specific? I think if we want to keep everything as general as possible, we should have something like vect_1_vector_size_available, vect_2_vector_sizes_available, etc.

New changelog:

2011-12-12  Michael Zolotukhin  michael.v.zolotuk...@intel.com

	* gcc.dg/vect/no-section-anchors-vect-31.c: Adjust diagnostic test to fix fail on AVX.
	* gcc.dg/vect/no-section-anchors-vect-66.c: Ditto.
	* gcc.dg/vect/no-section-anchors-vect-68.c: Ditto.
	* gcc.dg/vect/no-section-anchors-vect-69.c: Ditto.
	* gcc.dg/vect/no-vfa-vect-dv-2.c: Ditto.
	* gcc.dg/vect/pr45752.c: Ditto.
	* gcc.dg/vect/slp-perm-4.c: Ditto.
	* gcc.dg/vect/slp-perm-9.c: Ditto.
	* gcc.dg/vect/vect-33.c: Ditto.
	* gcc.dg/vect/vect-35.c: Ditto.
	* gcc.dg/vect/vect-6-big-array.c: Ditto.
	* gcc.dg/vect/vect-6.c: Ditto.
	* gcc.dg/vect/vect-91.c: Ditto.
	* gcc.dg/vect/vect-all-big-array.c: Ditto.
	* gcc.dg/vect/vect-all.c: Ditto.
	* gcc.dg/vect/vect-multitypes-1.c: Ditto.
	* gcc.dg/vect/vect-outer-4c.c: Ditto.
	* gcc.dg/vect/vect-outer-5.c: Ditto.
	* gcc.dg/vect/vect-over-widen-1.c: Ditto.
	* gcc.dg/vect/vect-over-widen-3.c: Ditto.
	* gcc.dg/vect/vect-over-widen-4.c: Ditto.
	* gcc.dg/vect/vect-peel-1.c: Ditto.
	* gcc.dg/vect/vect-peel-2.c: Ditto.
	* gcc.dg/vect/vect-peel-3.c: Ditto.
	* gcc.dg/vect/vect-reduc-pattern-1b.c: Ditto.
	* gcc.dg/vect/vect-reduc-pattern-1c.c: Ditto.
	* gcc.dg/vect/vect-reduc-pattern-2b.c: Ditto.
	* gcc.dg/vect/wrapv-vect-reduc-pattern-2c.c: Ditto.
	* gcc.dg/vect/no-section-anchors-vect-36.c: Adjust array size to fix fail on AVX.
	* gcc.dg/vect/no-section-anchors-vect-64.c: Ditto.
	* lib/target-supports.exp (check_effective_target_vect_any_perm): New function.
	(check_avx_available): Ditto.
	(check_effective_target_vect_aligned_arrays): Add handling of AVX.
	(check_effective_target_vect_multiple_sizes): Ditto.

On 12 December 2011 12:32, Uros Bizjak ubiz...@gmail.com wrote:

Hello!

This patch fixes dg-final scans in tests from vect.exp suite, which currently fail when avx2 is used.

--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
@@ -88,5 +88,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target {! vect_multiple_sizes} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail vect_multiple_sizes} } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */

Please do not add xfails through the patch; xfail means that a problem was identified and will someday be fixed. In the above case, just add a target condition, no need for an xfailed scan. If I'm not missing something, you can probably remove all introduced xfails and just add new target conditions.

 # Return 1 if avx instructions can be compiled.
+proc check_effective_explicit_target_avx { } {
+    return [check_no_messages_and_pattern e_avx !__builtin_ia32_vzeroall assembly {
+        void _mm256_zeroall (void)
+        {
+            __builtin_ia32_vzeroall ();
+        }
+    } -O2 ]
+}

Please use

# Return true if we are compiling for AVX target.

proc check_avx_available { } {
    return [check_no_compiler_messages avx_available assembly {
        #ifndef __AVX__
        #error unsupported
        #endif
    } ]
}

Uros.

--
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

vec-tests-avx2_fixes-2.patch
Description: Binary data
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
On Mon, Dec 12, 2011 at 03:00:52PM +0400, Michael Zolotukhin wrote: I changed xfails to target-checks - for now I use common vect_multiple_sizes (though it'll fail when wider vectors emerge). Also, I changed AVX-check to the version Uros suggested. Please check updated patch (attached). As for vect_multiple_sizes_32B_16B and similar - isn't it too target-specific? I think if we want to keep everything as general as possible, we should have something like vect_1_vector_size_available, vect_2_vector_sizes_available, etc. Depends on the test. For some tests you don't need to distinguish in between vect_multiple_sizes vs. !vect_multiple_sizes at all, the first size will work out. For other tests e.g. none could work out and thus you'd see for each attempted vector sizes similar messages. Then you can e.g. have tests with very small number of iterations that will e.g. work out only for 16-byte vectors and not for 32-byte vectors. In that case it depends on both vect_multiple_sizes vs. !vect_multiple_sizes, the number of sizes attempted, but also which was the first one, with -mprefer-avx128 you get different number of messages from -mavx -mno-prefer-avx128, because in the former case it will not retry with 32-byte vectors, while in the latter case it will start with 32-byte vectors and retry with 16-byte vectors. Jakub
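Jakub's point about the message count depending on both the set of vector sizes and the order in which they are tried can be illustrated with a toy model. The sketch below is not GCC code: the function name, the "one message per attempted size" rule and the "stop at the first size that works" rule are simplifying assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not GCC code. Assumptions: the vectorizer tries the vector
// sizes in the given order, emits one diagnostic per attempt, and stops
// at the first size that works for the loop.
std::size_t count_attempt_messages(const std::vector<int>& sizes_in_order,
                                   int largest_size_that_works)
{
    assert(largest_size_that_works >= 0);
    std::size_t messages = 0;
    for (int size : sizes_in_order) {
        ++messages;                      // one message per attempted size
        if (size <= largest_size_that_works)
            break;                       // success: no retry with the next size
    }
    return messages;
}
```

Under these assumptions a loop that only fits 16-byte vectors yields two messages when the order is {32, 16} (roughly the -mno-prefer-avx128 case) but one message when only {16} is tried, which is why a single expected count keyed on vect_multiple_sizes alone cannot cover every test.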
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
Hi,

as no other review happened, I changed the patch as you suggested. Tested for i686-w64-mingw32, and regression tested for x86_64-unknown-linux-gnu. OK to apply?

Regards,
Kai

ChangeLog

2011-12-12  Kai Tietz  kti...@redhat.com

	PR libstdc++/51135
	* libsupc++/cxxabi.h (__cxxabi_dtor_type): New type.
	(__cxa_throw): Use it for destructor-argument.
	* eh_throw.cc (__cxa_throw): Likewise.
	* unwind-cxx.h (__cxa_exception): Change type of member exceptionDestructor to __cxxabi_dtor_type.

Index: gcc/libstdc++-v3/libsupc++/cxxabi.h
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/cxxabi.h
+++ gcc/libstdc++-v3/libsupc++/cxxabi.h
@@ -51,6 +51,16 @@
 #include <bits/cxxabi_tweaks.h>
 #include <bits/cxxabi_forced.h>
 
+// On 32-bit IA native windows target is the used calling-convention
+// for class-member-functions of kind __thiscall. As destructor is
+// also of kind class-member-function, we need to specify for this
+// target proper calling-convention on destructor-function-pointer.
+#if defined (__MINGW32__) && defined (__i386__)
+typedef void (__thiscall *__cxxabi_dtor_type) (void *);
+#else
+typedef void (*__cxxabi_dtor_type) (void *);
+#endif
+
 #ifdef __cplusplus
 namespace __cxxabiv1
 {
@@ -596,7 +606,7 @@ namespace __cxxabiv1
   // Throw the exception.
   void
-  __cxa_throw(void*, std::type_info*, void (*) (void *))
+  __cxa_throw(void*, std::type_info*, __cxxabi_dtor_type)
   __attribute__((__noreturn__));
 
   // Used to implement exception handlers.
Index: gcc/libstdc++-v3/libsupc++/eh_throw.cc
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/eh_throw.cc
+++ gcc/libstdc++-v3/libsupc++/eh_throw.cc
@@ -58,8 +58,8 @@ __gxx_exception_cleanup (_Unwind_Reason_
 
 extern "C" void
-__cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
-			 void (*dest) (void *))
+__cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
+			 __cxxabi_dtor_type dest)
 {
   // Definitely a primary.
   __cxa_refcounted_exception *header
Index: gcc/libstdc++-v3/libsupc++/unwind-cxx.h
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/unwind-cxx.h
+++ gcc/libstdc++-v3/libsupc++/unwind-cxx.h
@@ -51,7 +51,7 @@ struct __cxa_exception
 {
   // Manage the exception object itself.
   std::type_info *exceptionType;
-  void (*exceptionDestructor)(void *);
+  __cxxabi_dtor_type exceptionDestructor;
 
   // The C++ standard has entertaining rules wrt calling set_terminate
   // and set_unexpected in the middle of the exception cleanup process.
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12/12/2011 12:11 PM, Kai Tietz wrote: Hi, so as no other review happend, I changed patch as you suggested. Well, sorry for not noticing earlier, but here, as in many other cases, I think it would be much cleaner to have the pre-processor games in the mingw config header, define a macro name there (normally undefined) and then use it here. Paolo.
Re: [patch] add __is_final trait to fix libstdc++/51365
On 12/12/2011, Paolo Carlini wrote: On 12/11/2011 04:05 PM, Jonathan Wakely wrote: ping In my opinion __is_final would be definitely useful in general, for 4.8, and 4.7 too, if isn't too late. As we've got the final keyword in 4.7 I think we really want __is_final in the front end too. As regards the wider issue which is being discussed on the reflector - beware, I didn't follow all the messages - 'final' disabling a nice optimization like EBO makes me very nervous. Really, doesn't seem part of the intended general philosophy in this area. There must be a way to overcome the annoyance. Last resort, if suggestions like having 'final' not forbidding private derivation cannot go through, we could imagine a GCC attribute reverting the effect of 'final' for people (library writers ;) knowing what they are doing. I don't know. I think being able to detect a final class is good enough for now, until we find out if there are real problems being encountered as people make more use of C++11.
Re: [patch] add __is_final trait to fix libstdc++/51365
On 12/12/2011 12:19 PM, Jonathan Wakely wrote: As regards the wider issue which is being discussed on the reflector - beware, I didn't follow all the messages - 'final' disabling a nice optimization like EBO makes me very nervous. Really, doesn't seem part of the intended general philosophy in this area. There must be a way to overcome the annoyance. Last resort, if suggestions like having 'final' not forbidding private derivation cannot go through, we could imagine a GCC attribute reverting the effect of 'final' for people (library writers ;) knowing what they are doing. I don't know. I think being able to detect a final class is good enough for now, until we find out if there are real problems being encountered as people make more use of C++11. Maybe. But in my opinion we should not rush. Something is wrong here at a more fundamental level. Paolo.
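For readers following the EBO concern, here is a minimal sketch of how a library could use the proposed trait to choose between deriving from an empty class and storing it as a member. It assumes a compiler that provides the __is_final built-in discussed in this thread; Holder, Empty and FinalEmpty are hypothetical names, not libstdc++ code.

```cpp
#include <type_traits>

struct Empty {};
struct FinalEmpty final {};

// Derive from T only when it is empty and not final; otherwise store it
// as a member. __is_final is the compiler built-in proposed in this thread.
template <typename T,
          bool CanDerive = std::is_empty<T>::value && !__is_final(T)>
struct Holder;

template <typename T>
struct Holder<T, true> : private T {   // empty base optimization applies
    int payload;
};

template <typename T>
struct Holder<T, false> {              // fallback: T occupies storage
    T value;
    int payload;
};

static_assert(sizeof(Holder<Empty>) == sizeof(int),
              "empty base adds no storage");
static_assert(sizeof(Holder<FinalEmpty>) > sizeof(int),
              "a final class cannot be a base, so it costs space");
```

The second assertion is the cost Paolo is nervous about: detecting 'final' lets the library stay correct, but it cannot recover the space the optimization would have saved.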
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Paolo Carlini paolo.carl...@oracle.com:

On 12/12/2011 12:11 PM, Kai Tietz wrote:

Hi, so as no other review happend, I changed patch as you suggested.

Well, sorry for not noticing earlier, but here, as in many other cases, I think it would be much cleaner to have the pre-processor games in the mingw config header, define a macro name there (normally undefined) and then use it here.

Paolo.

Well, this was my initial attempt to solve it. The issue here is that libsupc++ doesn't use the os-header-file, and I am not sure if it is wise to introduce it here. Adding it to the cxxabi.h header, which claims to reflect ABI issues, looks like a sensible alternative to me.

Kai
[Ada] Simplify Get_Target_Prefix in mlib-tgt-specific-xi.adb
Avoid hard-coded constants and simply reuse the prefix for ar and ranlib. No behavioural change. Tested on x86_64-pc-linux-gnu, committed on trunk 2011-12-12 Tristan Gingold ging...@adacore.com * mlib-tgt-specific-xi.adb: (Get_Target_Prefix): Simplify code. Index: mlib-tgt-specific-xi.adb === --- mlib-tgt-specific-xi.adb(revision 182223) +++ mlib-tgt-specific-xi.adb(working copy) @@ -3,11 +3,10 @@ -- GNAT COMPILER COMPONENTS -- -- -- -- M L I B . T G T. S P E C I F I C -- --- (Bare Board Version) -- -- -- -- B o d y -- -- -- --- Copyright (C) 2003-2008, Free Software Foundation, Inc. -- +-- Copyright (C) 2003-2011, Free Software Foundation, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -139,33 +138,11 @@ function Get_Target_Prefix return String is Target_Name : constant String_Ptr := Sdefault.Target_Name; - Index : Positive:= Target_Name'First; begin - while Index Target_Name'Last -and then Target_Name (Index + 1) /= '-' - loop - Index := Index + 1; - end loop; + -- Target_name is the program prefix without '-' but with a trailing '/' - if Target_Name (Target_Name'First .. Index) = avr then - return avr-; - elsif Target_Name (Target_Name'First .. Index) = erc32 then - return erc32-elf-; - elsif Target_Name (Target_Name'First .. Index) = leon then - return leon-elf-; - elsif Target_Name (Target_Name'First .. Index) = powerpc then - if Target_Name'Length = 23 and then - Target_Name (Target_Name'First .. Target_Name'First + 22) = - powerpc-unknown-eabispe - then -return powerpc-eabispe-; - else -return powerpc-elf-; - end if; - else - return ; - end if; + return Target_Name (Target_Name'First .. Target_Name'Last - 1) '-'; end Get_Target_Prefix; --
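The simplified body boils down to one string operation: drop the trailing '/' from the target name and append '-'. A C++ rendering of that operation is sketched below purely for illustration; the function name is hypothetical, and the shape of the input ("powerpc-elf/") is an assumption based on the comment in the patch, not on the contents of Sdefault.

```cpp
#include <cassert>
#include <string>

// Hypothetical C++ rendering of the simplified Ada function.
// Assumption: target_name is the program prefix with a trailing '/',
// for example "powerpc-elf/".
std::string target_prefix(const std::string& target_name)
{
    assert(!target_name.empty());
    // Drop the last character and append '-'.
    return target_name.substr(0, target_name.size() - 1) + '-';
}
```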
[Ada] Redefine FD_SETSIZE before including system headers
On some platforms, the sockets support code for the GNAT runtime library needs to redefine C macro FD_SETSIZE to increase its value from the system default. This must occur before any system header file is included, so that all code sees a consistent value. Tested on x86_64-pc-linux-gnu, committed on trunk 2011-12-12 Thomas Quinot qui...@adacore.com * gsocket.h, s-oscons-tmplt.c: Ensure we do not include any system header file prior to redefining FD_SETSIZE. Index: gsocket.h === --- gsocket.h (revision 182223) +++ gsocket.h (working copy) @@ -6,7 +6,7 @@ * * * C Header File * * * - * Copyright (C) 2004-2010, Free Software Foundation, Inc. * + * Copyright (C) 2004-2011, Free Software Foundation, Inc. * * * * GNAT is free software; you can redistribute it and/or modify it under * * terms of the GNU General Public License as published by the Free Soft- * @@ -58,9 +58,12 @@ /* For Tru64 */ #endif -#include limits.h -#include errno.h +/** No system header may be included prior to this point since on some targets + ** we need to redefine FD_SETSIZE. + **/ +/* Target-specific includes and definitions */ + #if defined(__vxworks) #include vxWorks.h #include ioLib.h @@ -163,6 +166,8 @@ #elif defined(VMS) #define FD_SETSIZE 4096 +#include sys/types.h +#include sys/time.h #ifndef IN_RTS /* These DEC C headers are not available when building with GCC */ #include in.h @@ -173,6 +178,9 @@ #endif +#include limits.h +#include errno.h + #if defined (__vxworks) ! defined (__RTP__) #include sys/times.h #else @@ -180,11 +188,11 @@ #endif /* - * RTEMS has these .h files but not until you have built and installed - * RTEMS. When building a C/C++ toolset, you also build the newlib C library. - * So the build procedure for an RTEMS GNAT toolset requires that - * you build a C/C++ toolset, then build and install RTEMS with - * --enable-multilib, and finally build the Ada part of the toolset. + * RTEMS has these .h files but not until you have built and installed RTEMS. 
+ * When building a C/C++ toolset, you also build the newlib C library, so the + * build procedure for an RTEMS GNAT toolset requires that you build a C/C++ + * toolset, then build and install RTEMS with --enable-multilib, and finally + * build the Ada part of the toolset. */ #if !(defined (VMS) || defined (__MINGW32__)) #include sys/socket.h Index: s-oscons-tmplt.c === --- s-oscons-tmplt.c(revision 182223) +++ s-oscons-tmplt.c(working copy) @@ -78,6 +78,8 @@ ** $ RUN xoscons **/ +/* Feature macro definitions */ + #if defined (__linux__) !defined (_XOPEN_SOURCE) /** For Linux _XOPEN_SOURCE must be defined, otherwise IOV_MAX is not defined **/ @@ -93,6 +95,10 @@ #endif #endif +/* Include gsocket.h before any system header so it can redefine FD_SETSIZE */ + +#include gsocket.h + #include stdlib.h #include string.h #include limits.h @@ -130,8 +136,6 @@ # include vxWorks.h #endif -#include gsocket.h - #ifdef DUMMY # if defined (TARGET)
[Ada] Fixed bugs in iterators for vector containers
The First and Last selector functions return a value that depends on whether this is complete or partial iteration. For complete iteration, the selector function returns the logical beginning of the entire sequence of items in the container. (To be specific, Container.First for a forward iterator, and Container.Last for a reverse iterator.) For partial iteration, the selector function returns the start position value specified when the iterator object was constructed (in this case, both First and Last return the same value). The Next and Previous iterator operations vet the cursor parameter to ensure that it designates a node in the same container as the iterator. The function then forwards to the call to the analogous cursor-based operation. Iterate constructs an iterator object whose state indicates whether this is complete or partial iteration. There was also change in the semantics of the partial iterator (per the ARG meeting in Denver on 2011/11): if the start position equals No_Element, then it raises Constraint_Error; otherwise, it constructs an iterator object to indicate the position from which the iteration begins (which is in turn used by the selector functions First and Last). Tested on x86_64-pc-linux-gnu, committed on trunk 2011-12-12 Matthew Heaney hea...@adacore.com * a-convec.adb, a-coinve.adb, a-cobove.adb (Iterator): Use subtype Index_Type'Base for Index component (Finalize): Remove unnecessary access check (First, Last): Cursor return value depends on iterator index value (Iterate): Use start position as iterator index value (Next, Previous): Forward to corresponding cursor-based operation. * a-cborma.adb (Iterate): Properly initialize iterator object (with 0 as node index). 
Index: a-cborma.adb === --- a-cborma.adb(revision 182223) +++ a-cborma.adb(working copy) @@ -935,7 +935,7 @@ return It : constant Iterator := (Limited_Controlled with Container = Container'Unrestricted_Access, - Node = Container.First) + Node = 0) do B := B + 1; end return; Index: a-cobove.adb === --- a-cobove.adb(revision 182223) +++ a-cobove.adb(working copy) @@ -38,7 +38,7 @@ Vector_Iterator_Interfaces.Reversible_Iterator with record Container : Vector_Access; - Index : Index_Type; + Index : Index_Type'Base; end record; overriding procedure Finalize (Object : in out Iterator); @@ -667,14 +667,9 @@ -- procedure Finalize (Object : in out Iterator) is + B : Natural renames Object.Container.Busy; begin - if Object.Container /= null then - declare -B : Natural renames Object.Container.all.Busy; - begin -B := B - 1; - end; - end if; + B := B - 1; end Finalize; -- @@ -740,10 +735,24 @@ function First (Object : Iterator) return Cursor is begin - if Is_Empty (Object.Container.all) then - return No_Element; + -- The value of the iterator object's Index component influences the + -- behavior of the First (and Last) selector function. + + -- When the Index component is No_Index, this means the iterator object + -- was constructed without a start expression, in which case the + -- (forward) iteration starts from the (logical) beginning of the entire + -- sequence of items (corresponding to Container.First, for a forward + -- iterator). + + -- Otherwise, this is iteration over a partial sequence of items. When + -- the Index component isn't No_Index, the iterator object was + -- constructed with a start expression, that specifies the position from + -- which the (forward) partial iteration begins. 
+ + if Object.Index = No_Index then + return First (Object.Container.all); else - return Cursor'(Object.Container, Index_Type'First); + return Cursor'(Object.Container, Object.Index); end if; end First; @@ -1648,12 +1657,24 @@ (Container : Vector) return Vector_Iterator_Interfaces.Reversible_Iterator'Class is - B : Natural renames Container'Unrestricted_Access.all.Busy; + V : constant Vector_Access := Container'Unrestricted_Access; + B : Natural renames V.Busy; + begin + -- The value of its Index component influences the behavior of the First + -- and Last selector functions of the iterator object. When the Index + -- component is No_Index (as is the case here), this means the iterator + -- object was constructed without a start expression. This is a complete + -- iterator, meaning that the iteration starts from the
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12/12/2011 12:29 PM, Kai Tietz wrote:

Well, this was my initial attempt to solve it. The issue here is that libsupc++ doesn't use the os-header-file

Are you sure? Should include bits/c++config.h, no?

Paolo.
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
gcc-patches-ow...@gcc.gnu.org wrote on 12/12/2011 01:00:52 PM: I changed xfails to target-checks - for now I use common vect_multiple_sizes (though it'll fail when wider vectors emerge). Also, I changed AVX-check to the version Uros suggested. Please check updated patch (attached). As for vect_multiple_sizes_32B_16B and similar - isn't it too target-specific? I think if we want to keep everything as general as possible, we should have something like vect_1_vector_size_available, vect_2_vector_sizes_available, etc. I think there is a difference between different vector sizes, and calling it vect_X_vector_size_available is not sufficient. Your patch will cause failures on ARM. It has two vector sizes, 16 and 8 bytes. E.g., vect-33.c gets vectorized with the default vector size, and the alignment message should be printed only once, and not twice as with your patch. So, it looks like you need several vect_multiple_sizes_X. Ira
Re: [patch] Fix PR tree-optimization/50569
Well, I can live with this change (though I cannot approve anything). On the other hand, the real underlying problem is that expander cannot handle unaligned MEM_REFs where strict alignment is required. SRA is of course much more prone to create such situations than anything else but I wonder whether they can creep up elsewhere too. It also takes us in the opposite direction than the one initially intended with MEM_REFs, doesn't it? Certainly, but we need to fix the regression in a relatively safe manner. That said, I looked into the expander briefly in summer but given my level of experience in that area I did not nearly have enough time. I still plan to look into this issue in expander but for the same reasons I cannot guarantee any quick success. So I acknowledge this is the only working approach to a long-standing difficult bug... and most probably the most appropriate for the 4.6 branch. Thanks. This is still the same very old issue: misalignment cannot be handled indirectly (because we don't really have misaligned pointers) so MEM_REFs can be used safely only when everything is properly aligned. However, since we have them, shouldn't we use stack-based vectors to handle the stack of COMPONENT_REFs? Indeed, it will make the change before installing. -- Eric Botcazou
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Paolo Carlini paolo.carl...@oracle.com:

On 12/12/2011 12:29 PM, Kai Tietz wrote:

Well, this was my initial attempt to solve it. The issue here is that libsupc++ doesn't use the os-header-file

Are you sure? Should include bits/c++config.h, no?

Paolo.

Well, I tested it, and I saw that the defines in the os part weren't set for libsupc++. Looking into this, it might also be caused by some .cc files that use cxxabi.h not including bits/c++config.h first.

Kai
[Ada] Always get an existing declared object/exec directory
If an object and/or exec directory exists and is declared in a project with no source, it was not taken into account. This patch correct this. Tested on x86_64-pc-linux-gnu, committed on trunk 2011-12-12 Vincent Celier cel...@adacore.com * prj-nmsc.adb (Get_Directories): For a non extending project, always get a declared object and/or exec directory if it already exists, even when there are no sources, but do not create them. Index: prj-nmsc.adb === --- prj-nmsc.adb(revision 182223) +++ prj-nmsc.adb(working copy) @@ -5284,8 +5284,24 @@ Object_Dir cannot be empty, Object_Dir.Location, Project); - elsif not No_Sources then + elsif Setup_Projects and then + No_Sources and then + Project.Extends = No_Project + then +-- Do not create an object directory for a non extending project +-- with no sources. +Locate_Directory + (Project, + File_Name_Type (Object_Dir.Value), + Path = Project.Object_Directory, + Dir_Exists = Dir_Exists, + Data = Data, + Location = Object_Dir.Location, + Must_Exist = False, + Externally_Built = Project.Externally_Built); + + else -- We check that the specified object directory does exist. -- However, even when it doesn't exist, we set it to a default -- value. This is for the benefit of tools that recover from @@ -5355,8 +5371,23 @@ Exec_Dir cannot be empty, Exec_Dir.Location, Project); - elsif not No_Sources then + elsif Setup_Projects and then + No_Sources and then + Project.Extends = No_Project + then +-- Do not create an exec directory for a non extending project +-- with no sources. +Locate_Directory + (Project, + File_Name_Type (Exec_Dir.Value), + Path = Project.Exec_Directory, + Dir_Exists = Dir_Exists, + Data = Data, + Location = Exec_Dir.Location, + Externally_Built = Project.Externally_Built); + + else -- We check that the specified exec directory does exist Locate_Directory
[Ada] Illegal call on abstract operator
This patch fixes an obscure bug where gnat was failing to detect an illegal call on an abstract operator. In particular, when the operands are of a universal numeric type. This bug occurred only in Ada 2005 mode (and higher). The following test should get an error: illegal_abst_func.adb:5:24: cannot call abstract subprogram + procedure Illegal_Abst_Func is type My_Integer is new Integer; function + (Left, Right: My_Integer) return My_Integer is abstract; X : My_Integer := 2 + 2; -- Illegal! begin null; end Illegal_Abst_Func; Tested on x86_64-pc-linux-gnu, committed on trunk 2011-12-12 Bob Duff d...@adacore.com * sem_res.adb (Resolve): Deal with the case where an abstract operator is called with operands of type universal_integer. Index: sem_res.adb === --- sem_res.adb (revision 182223) +++ sem_res.adb (working copy) @@ -1989,6 +1989,9 @@ end if; Debug_A_Entry (resolving , N); + if Debug_Flag_V then + Write_Overloads (N); + end if; if Comes_From_Source (N) then if Is_Fixed_Point_Type (Typ) then @@ -2033,6 +2036,11 @@ Get_First_Interp (N, I, It); Interp_Loop : while Present (It.Typ) loop +if Debug_Flag_V then + Write_Str (Interp: ); + Write_Interp (It); +end if; + -- We are only interested in interpretations that are compatible -- with the expected type, any other interpretations are ignored. @@ -2054,6 +2062,10 @@ and then Typ /= Universal_Real and then Present (It.Abstract_Op) then + if Debug_Flag_V then + Write_Line (Skip.); + end if; + goto Continue; end if; @@ -2572,9 +2584,36 @@ Resolution_Failed; return; - -- Here we have an acceptable interpretation for the context + else + -- In Ada 2005, if we have something like X : T := 2 + 2;, where + -- the + on T is abstract, and the operands are of universal type, + -- the above code will have (incorrectly) resolved the + to the + -- universal one in Standard. Therefore, we check for this case, and + -- give an error. 
We can't do this earlier, because it would cause + -- legal cases to get errors (when some other type has an abstract + -- +). - else + if Ada_Version = Ada_2005 and then + Nkind (N) in N_Op and then + Is_Overloaded (N) and then + Is_Universal_Numeric_Type (Etype (Entity (N))) + then +Get_First_Interp (N, I, It); +while Present (It.Typ) loop + if Present (It.Abstract_Op) and then + Etype (It.Abstract_Op) = Typ + then + Error_Msg_NE +(cannot call abstract subprogram !, N, It.Abstract_Op); + return; + end if; + + Get_Next_Interp (I, It); +end loop; + end if; + + -- Here we have an acceptable interpretation for the context + -- Propagate type information and normalize tree for various -- predefined operations. If the context only imposes a class of -- types, rather than a specific type, propagate the actual type
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Kai Tietz ktiet...@googlemail.com:

2011/12/12 Paolo Carlini paolo.carl...@oracle.com:

On 12/12/2011 12:29 PM, Kai Tietz wrote:

Well, this was my initial attempt to solve it. The issue here is that libsupc++ doesn't use the os-header-file

Are you sure? Should include bits/c++config.h, no?

Paolo.

Well, I tested it, and I saw that the define in the os part for libsupc++ weren't set. By looking into this, this might be also caused by not including in all .cc files using cxxabi.h before bits/c++config.h.

Kai

Hmm, strange. cxxabi.h is supposed to include bits/c++config.h itself. I will retest.

Kai
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12 December 2011 11:29, Kai Tietz ktiet...@googlemail.com wrote:

2011/12/12 Paolo Carlini paolo.carl...@oracle.com:

On 12/12/2011 12:11 PM, Kai Tietz wrote:

Hi, so as no other review happend, I changed patch as you suggested.

Well, sorry for not noticing earlier, but here, as in many other cases, I think it would be much cleaner to have the pre-processor games in the mingw config header, define a macro name there (normally undefined) and then use it here.

Paolo.

Well, this was my initial attempt to solve it. The issue here is that libsupc++ doesn't use the os-header-file and I am not sure if it is wise to introduce it here. To add it to the cxxabi.h header, which claims to reflect ABI issue, looks sensible as alternative to me here.

I think Paolo means:

#ifdef _GLIBCXX_USE_THISCALL_ON_DTOR
typedef void (__thiscall *__cxxabi_dtor_type) (void *);
#else
typedef void (*__cxxabi_dtor_type) (void *);
#endif

instead of testing __MINGW32__ and __i386__

Also, my suggestion of __cxxabi_dtor_type would be more consistent if it was spelled __cxa not __cxxabi (sorry, it was just a quick suggestion, not a request to actually use that name!)
Re: warn about deprecated access declarations
On 12 December 2011 10:08, Andreas Schwab wrote:
Jonathan Wakely jwakely@gmail.com writes:
On 12 December 2011 09:18, Andreas Schwab wrote:
Jonathan Wakely jwakely@gmail.com writes:
On 11 December 2011 22:22, Fabien Chêne wrote:

Consequently, I propose to deprecate them with a warning, as clang already does. So that you get a warning for the following code:

struct A { int i; };
struct B : A
{
  A::i; // <- warning here
};

warning: access declarations are deprecated; employ using declarations instead [-Wdeprecated]

Whether or not it's suitable for stage 3, "employ" feels a bit clunky in this context, how about "access declarations are deprecated in favour of using-declarations"?

How about "...; suggest adding the using keyword"?

That sounds like the compiler is suggesting that the user suggests doing that!

It is similar to "suggest parentheses".

Good point, that's not correct English either, but it would be consistent.

("Suggest X" is an imperative, telling the user to suggest X. The intention is for the compiler to suggest it, not tell the user to suggest it, so the correct grammar would be "GCC suggests X".)
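For reference, the replacement the warning asks for is a one-keyword change. The sketch below uses private inheritance, which is an assumption made here so that the using-declaration has a visible effect; the classes in the quoted example use public inheritance.

```cpp
#include <type_traits>

struct A { int i; };

// Deprecated access declaration (what the proposed warning flags):
//   struct B : private A { A::i; };
// Replacement with a using-declaration, same effect on accessibility:
struct B : private A {
    using A::i;
};

// &B::i is well-formed outside B only because the using-declaration makes
// the member public in B; it still names A's member, hence the type.
static_assert(std::is_same<decltype(&B::i), int A::*>::value,
              "B::i refers to A::i");
```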
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12/12/2011 12:50 PM, Jonathan Wakely wrote:

I think Paolo means:

#ifdef _GLIBCXX_USE_THISCALL_ON_DTOR
typedef void (__thiscall *__cxxabi_dtor_type) (void *);
#else
typedef void (*__cxxabi_dtor_type) (void *);
#endif

instead of testing __MINGW32__ and __i386__

This for sure, but I think we could as well move the whole thing in the config file, like:

#ifndef _GLIBCXX_USE_THISCALL_ON_DTOR
typedef void (*__cxxabi_dtor_type) (void *);
#endif

Paolo.
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
I think there is a difference between different vector sizes, and calling it vect_X_vector_size_available is not sufficient. Your patch will cause failures on ARM. It has two vector sizes, 16 and 8 bytes. E.g., vect-33.c gets vectorized with the default vector size, and the alignment message should be printed only once, and not twice as with your patch. So, it looks like you need several vect_multiple_sizes_X.

Probably we really need to have something like vect_multiple_sizes_32B_16B - it'll help in this test. In this case we could check whether a specific size is available, which is crucial when array sizes are so small that they determine the vector width to be used; vect_multiple_sizes isn't sufficient for this, because the specific size matters. I'll prepare a patch with such changes.

By the way, how could we check if '-mprefer-avx128' was specified from target-supports.exp? Is there any global variable for command-line options or something similar?

On 12 December 2011 15:36, Ira Rosen i...@il.ibm.com wrote:

gcc-patches-ow...@gcc.gnu.org wrote on 12/12/2011 01:00:52 PM:

I changed xfails to target-checks - for now I use common vect_multiple_sizes (though it'll fail when wider vectors emerge). Also, I changed AVX-check to the version Uros suggested. Please check updated patch (attached). As for vect_multiple_sizes_32B_16B and similar - isn't it too target-specific? I think if we want to keep everything as general as possible, we should have something like vect_1_vector_size_available, vect_2_vector_sizes_available, etc.

I think there is a difference between different vector sizes, and calling it vect_X_vector_size_available is not sufficient. Your patch will cause failures on ARM. It has two vector sizes, 16 and 8 bytes. E.g., vect-33.c gets vectorized with the default vector size, and the alignment message should be printed only once, and not twice as with your patch. So, it looks like you need several vect_multiple_sizes_X.

Ira

-- --- Best regards, Michael V. Zolotukhin, Software Engineer Intel Corporation.
Re: Memset/memcpy patch
Any update?

On 5 December 2011 15:14, Michael Zolotukhin michael.v.zolotuk...@gmail.com wrote:

Hi Jan, I debugged the changes, and I think I've hunted down all the bugs. I slightly refactored the code - now all new SSE-related code is more localized. Also, I fixed some alignment issues. Please find the new patch in the attachment (it's made against rev 181709) - is it ok for trunk? Bootstrap and 'make check' passed on Atom and Corei7 (32,64 bits). I also checked specs2000, eembc1_1 and eembc2_0 on Atom.

On 26 November 2011 09:18, Jan Hubicka hubi...@ucw.cz wrote:

On Wed, Nov 23, 2011 at 3:32 PM, Michael Zolotukhin michael.v.zolotuk...@gmail.com wrote:

I found and fixed another problem in the latest memcpy/memset changes - with this fix all the failing tests mentioned in #51134 started passing. Bootstraps are also ok. Though I still see fails in 32-bit make check, so probably, it'd be better to revert the changes till these fails are fixed.

I will revert it for now.

OK. I guess I can break out the simple fixes and commit them for 4.7 and we could revisit this for next stage1. Probably not by adding all the features together, but extending prologues/epilogues first and adding SSE loops with the new alignment logic next.

Honza

-- H.J.

-- --- Best regards, Michael V. Zolotukhin, Software Engineer Intel Corporation.
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Paolo Carlini paolo.carl...@oracle.com:

On 12/12/2011 12:50 PM, Jonathan Wakely wrote:

I think Paolo means:

#ifdef _GLIBCXX_USE_THISCALL_ON_DTOR
typedef void (__thiscall *__cxxabi_dtor_type) (void *);
#else
typedef void (*__cxxabi_dtor_type) (void *);
#endif

instead of testing __MINGW32__ and __i386__

This for sure, but I think we could as well move the whole thing in the config file, like:

#ifndef _GLIBCXX_USE_THISCALL_ON_DTOR
typedef void (*__cxxabi_dtor_type) (void *);
#endif

Paolo.

Fine, nevertheless the test in os-config file for __i386__ is required, as just for IA 32-bit this calling convention is for interest. Neither x64 nor ARM etc requires it.

Kai
[Ada] Store the value of 'alignment of tagged types in the TSD
This patch removes primitive 'alignment to tagged types. This value is now stored in the Type Specific Data record associated with each tagged type since it is information known at compile-time. Tested on x86_64-pc-linux-gnu, committed on trunk 2011-12-12 Javier Miranda mira...@adacore.com * a-tags.ads (Alignment): New TSD field. (Max_Predef_Prims): Value lowered to 15 (or 9 in case of configurable runtime) Update documentation of predefined primitives since Alignment has been removed. * exp_disp.ads Update documentation of slots of dispatching primitives. * exp_disp.adb (Default_Prim_Op_Position): Update slot values since alignment is no longer a predefined primitive. (Is_Predefined_Dispatch_Operation): Remove _alignment. (Is_Predefined_Internal_Operation): Remove _alignment. (Make_DT): Update static test on the value stored in a-tags.ads for Max_Predef_Prims; store the value of 'alignment in the TSD. * exp_atag.ads, exp_atag.adb (Build_Get_Alignment): New subprogram that retrieves the alignment from the TSD * exp_util.adb (Build_Allocated_Deallocate_Proc): For deallocation of class-wide types obtain the value of alignment from the TSD. * exp_attr.adb (Expand_N_Attribute_Reference): For 'alignment applied to a class-wide type invoke Build_Get_Alignment to generate code which retrieves the value of the alignment from the TSD. * rtsfind.ads (RE_Alignment): New Ada.Tags entity * sem_ch13.adb (Analyze_Attribute_Definition_Clause): For tagged types if the value of the alignment is bigger than the Maximum alignment then set the value of the alignment to the Maximum alignment and report a warning. * exp_ch3.adb (Make_Predefined_Primitive_Specs): Do not generate spec of _alignment. (Predefined_Primitive_Bodies): Do not generate body of _alignment. 
Index: exp_atag.adb === --- exp_atag.adb(revision 182223) +++ exp_atag.adb(working copy) @@ -289,6 +289,25 @@ (RTE_Record_Component (RE_Access_Level), Loc)); end Build_Get_Access_Level; + - + -- Build_Get_Alignment -- + - + + function Build_Get_Alignment + (Loc : Source_Ptr; + Tag_Node : Node_Id) return Node_Id + is + begin + return +Make_Selected_Component (Loc, + Prefix = +Build_TSD (Loc, + Unchecked_Convert_To (RTE (RE_Address), Tag_Node)), + Selector_Name = +New_Reference_To + (RTE_Record_Component (RE_Alignment), Loc)); + end Build_Get_Alignment; + -- -- Build_Get_Predefined_Prim_Op_Address -- -- Index: exp_atag.ads === --- exp_atag.ads(revision 182223) +++ exp_atag.ads(working copy) @@ -66,6 +66,13 @@ -- -- Generates: TSD (Tag).Access_Level + function Build_Get_Alignment + (Loc : Source_Ptr; + Tag_Node : Node_Id) return Node_Id; + -- Build code that retrieves the alignment of the tagged type. + -- + -- Generates: TSD (Tag).Alignment + procedure Build_Get_Predefined_Prim_Op_Address (Loc : Source_Ptr; Position : Uint; Index: exp_util.adb === --- exp_util.adb(revision 182223) +++ exp_util.adb(working copy) @@ -755,8 +755,33 @@ Append_To (Actuals, New_Reference_To (Addr_Id, Loc)); Append_To (Actuals, New_Reference_To (Size_Id, Loc)); - Append_To (Actuals, New_Reference_To (Alig_Id, Loc)); + if Is_Allocate + or else not Is_Class_Wide_Type (Desig_Typ) + then +Append_To (Actuals, New_Reference_To (Alig_Id, Loc)); + + -- For deallocation of class wide types we obtain the value of + -- alignment from the Type Specific Record of the deallocated object. + -- This is needed because the frontend expansion of class-wide types + -- into equivalent types confuses the backend. + + else +-- Generate: +-- Obj.all'Alignment + +-- ... 
because 'Alignment applied to class-wide types is expanded +-- into the code that reads the value of alignment from the TSD +-- (see Expand_N_Attribute_Reference) + +Append_To (Actuals, + Unchecked_Convert_To (RTE (RE_Storage_Offset), +Make_Attribute_Reference (Loc, + Prefix = +Make_Explicit_Dereference (Loc, Relocate_Node (Expr)), + Attribute_Name = Name_Alignment))); + end if; +
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
On Mon, Dec 12, 2011 at 03:57:09PM +0400, Michael Zolotukhin wrote:

By the way, how could we check if '-mprefer-avx128' was specified from target-supports.exp? Is there any global-variable for command line options or something similar?

I'd say try some very simple vectorized loop and check how it has been vectorized. Whether using %xmm or %ymm vectors. Say

int a[1024], b[1024], c[1024];
void
foo (void)
{
  int i;
  for (i = 0; i < 1024; i++)
    a[i] = b[i] + c[i];
}

or so.

Jakub
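For illustration only, the probe loop suggested above could be wrapped into a small self-checking file like the sketch below. The helper name probe_selfcheck and the macro N are made up for this example and are not part of target-supports.exp. The sketch only verifies that the loop computes the right sums; whether it was vectorized with %xmm or %ymm would still have to be read off the generated assembly (for instance by compiling with -O2 -ftree-vectorize -mavx -S and scanning the output), which is not shown here.

```c
/* Sketch of a probe for the preferred vector size.  The interesting
   part is the assembly generated for foo; the self-check below only
   makes sure the loop itself is correct.  */
#include <assert.h>

#define N 1024

int a[N], b[N], c[N];

/* The loop under test: trivially vectorizable.  */
void
foo (void)
{
  int i;
  for (i = 0; i < N; i++)
    a[i] = b[i] + c[i];
}

/* Hypothetical helper: fill the inputs, run foo and verify the sums.  */
void
probe_selfcheck (void)
{
  int i;

  for (i = 0; i < N; i++)
    {
      b[i] = i;
      c[i] = 2 * i;
    }

  foo ();

  for (i = 0; i < N; i++)
    assert (a[i] == 3 * i);
}
```

A predicate built on this idea would compile only foo and look for the register names in the output, so it would follow whatever default -march/-mtune selects rather than the literal command-line flags.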
[Ada] Visibility in expression functions
If the expression function is not a completion, the usage names in the expression must be determined at the point of declaration, even though the generated body is inserted at the end of the current declaration list or package to prevent early freezing. The following must be rejected with: forward_reference.ads:2:35: F2 is undefined --- package Forward_Reference is function F1 return Boolean is (F2);-- Error: forward reference function F2 return Boolean is (True); end Forward_Reference; Tested on x86_64-pc-linux-gnu, committed on trunk 2011-12-12 Ed Schonberg schonb...@adacore.com * sem_ch6.adb (Analyze_Expression_Function): If the function is not a completion, pre-analyze the expression now to prevent spurious visibility on later entities. The body is inserted at the end of the current declaration list or package to prevent early freezing, but the visibility is established at the point of definition. Index: sem_ch6.adb === --- sem_ch6.adb (revision 182230) +++ sem_ch6.adb (working copy) @@ -281,6 +281,7 @@ New_Body : Node_Id; New_Decl : Node_Id; New_Spec : Node_Id; + Ret : Node_Id; begin -- This is one of the occasions on which we transform the tree during @@ -302,15 +303,15 @@ Prev := Find_Corresponding_Spec (N); end if; + Ret := Make_Simple_Return_Statement (LocX, Expression (N)); + New_Body := Make_Subprogram_Body (Loc, Specification = New_Spec, Declarations = Empty_List, Handled_Statement_Sequence = Make_Handled_Sequence_Of_Statements (LocX, - Statements = New_List ( -Make_Simple_Return_Statement (LocX, - Expression = Expression (N); + Statements = New_List (Ret))); if Present (Prev) and then Ekind (Prev) = E_Generic_Function then @@ -362,10 +363,13 @@ -- To prevent premature freeze action, insert the new body at the end -- of the current declarations, or at the end of the package spec. + -- However, resolve usage names now, to prevent spurious visibility + -- on later entities. 
declare Decls : List_Id := List_Containing (N); Par : constant Node_Id := Parent (Decls); +Id: constant Entity_Id := Defining_Entity (New_Decl); begin if Nkind (Par) = N_Package_Specification @@ -377,6 +381,11 @@ end if; Insert_After (Last (Decls), New_Body); +Push_Scope (Id); +Install_Formals (Id); +Preanalyze_Spec_Expression (Expression (Ret), Etype (Id)); +End_Scope; + end; end if;
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12/12/2011 12:58 PM, Kai Tietz wrote: Fine, nevertheless the test in os-config file for __i386__ is required, as just for IA 32-bit this calling convention is for interest. Neither x64 nor ARM etc requires it. So - just out of curiosity, ultimately you are responsible for the config files - why we do have separate mingw32 and mingw32-w64 configs? Paolo.
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
Michael Zolotukhin michael.v.zolotuk...@gmail.com wrote on 12/12/2011 01:57:09 PM: By the way, how could we check if '-mprefer-avx128' was specified from target-supports.exp? If I understand your question correctly, you can use check-flags (see check_effective_target_arm_fp16_ok_nocache for example). Is there any global-variable for command line options or something similar? flags Ira
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Paolo Carlini paolo.carl...@oracle.com: On 12/12/2011 12:58 PM, Kai Tietz wrote: Fine, nevertheless the test in os-config file for __i386__ is required, as just for IA 32-bit this calling convention is for interest. Neither x64 nor ARM etc requires it. So - just out of curiosity, ultimately you are responsible for the config files - why we do have separate mingw32 and mingw32-w64 configs? Paolo. Well, this is mainly caused by different feature-set of runtimes. We could solve things here also by probing for specific runtimes and so using just one configure tree. Kai
Re: [Patch] Adjust diag-scans in vect-tests to fix fails on AVX/AVX2
On Mon, Dec 12, 2011 at 02:16:04PM +0200, Ira Rosen wrote: Michael Zolotukhin michael.v.zolotuk...@gmail.com wrote on 12/12/2011 01:57:09 PM: By the way, how could we check if '-mprefer-avx128' was specified from target-supports.exp? If I understand your question correctly, you can use check-flags (see check_effective_target_arm_fp16_ok_nocache for example). The problem is that -mprefer-avx128/-mno-prefer-avx128 has a default determined by -march/-mtune, some CPU tuning turn on -mprefer-avx128 by default. Jakub
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12/12/2011 01:19 PM, Kai Tietz wrote: Well, this is mainly caused by different feature-set of runtimes. We could solve things here also by probing for specific runtimes and so using just one configure tree. Ah, thus, it's *not* true that mingw32, at variance with mingw32-w64, is only used for i386? Anyway, as I said already, in the config files you can check all the macros you like ;) Paolo.
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Paolo Carlini paolo.carl...@oracle.com:

On 12/12/2011 01:19 PM, Kai Tietz wrote:

Well, this is mainly caused by different feature-set of runtimes. We could solve things here also by probing for specific runtimes and so using just one configure tree.

Ah, thus, it's *not* true that mingw32, at variance with mingw32-w64, is only used for i386? Anyway, as I said already, in the config files you can check all the macros you like ;)

Paolo.

No, mingw32 (the mingw.org variant) is used for IA 32-bit. Mingw-w64 additionally allows a build in a mode compatible with mingw.org (by using -pc- as vendor-key in the triplet), so there is actually also a variant for x64 present for it, too. For multilib it is required to check in the target's config for the __i386__ target.

So updated patch is:

ChangeLog

2011-12-12  Kai Tietz  kti...@redhat.com

        PR libstdc++/51135
        * libsupc++/cxxabi.h (__cxxabi_dtor_type): New type.
        (__cxa_throw): Use it for destructor-argument.
        * libsupc++/eh_throw.cc (__cxa_throw): Likewise.
        * libsupc++/unwind-cxx.h (__cxa_exception): Change type of member
        exceptionDestructor to __cxxabi_dtor_type.
        * config/os/mingw32-w64/os_defines.h (_GLIBCXX_USE_THISCALL_ON_DTOR):
        Define.
        (__cxa_dtor_type): Declare target specific type variant.
        * config/os/mingw32/os_defines.h: Likewise.

Index: gcc/libstdc++-v3/libsupc++/cxxabi.h
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/cxxabi.h
+++ gcc/libstdc++-v3/libsupc++/cxxabi.h
@@ -51,6 +51,10 @@
 #include <bits/cxxabi_tweaks.h>
 #include <bits/cxxabi_forced.h>
 
+#ifndef _GLIBCXX_USE_THISCALL_ON_DTOR
+typedef void (*__cxa_dtor_type) (void *);
+#endif
+
 #ifdef __cplusplus
 namespace __cxxabiv1
 {
@@ -596,7 +600,7 @@ namespace __cxxabiv1
 
   // Throw the exception.
   void
-  __cxa_throw(void*, std::type_info*, void (*) (void *))
+  __cxa_throw(void*, std::type_info*, __cxa_dtor_type)
   __attribute__((__noreturn__));
 
   // Used to implement exception handlers.
Index: gcc/libstdc++-v3/libsupc++/eh_throw.cc
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/eh_throw.cc
+++ gcc/libstdc++-v3/libsupc++/eh_throw.cc
@@ -58,8 +58,8 @@ __gxx_exception_cleanup (_Unwind_Reason_
 
 
 extern "C" void
-__cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
-                         void (*dest) (void *))
+__cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
+                         __cxa_dtor_type dest)
 {
   // Definitely a primary.
   __cxa_refcounted_exception *header
Index: gcc/libstdc++-v3/libsupc++/unwind-cxx.h
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/unwind-cxx.h
+++ gcc/libstdc++-v3/libsupc++/unwind-cxx.h
@@ -51,7 +51,7 @@ struct __cxa_exception
 {
   // Manage the exception object itself.
   std::type_info *exceptionType;
-  void (*exceptionDestructor)(void *);
+  __cxa_dtor_type exceptionDestructor;
 
   // The C++ standard has entertaining rules wrt calling set_terminate
   // and set_unexpected in the middle of the exception cleanup process.
Index: gcc/libstdc++-v3/config/os/mingw32-w64/os_defines.h
===================================================================
--- gcc.orig/libstdc++-v3/config/os/mingw32-w64/os_defines.h
+++ gcc/libstdc++-v3/config/os/mingw32-w64/os_defines.h
@@ -65,4 +65,9 @@
 // ioctlsocket function doesn't work for normal file-descriptors.
 #define _GLIBCXX_NO_IOCTL 1
 
+#if defined (__i386__)
+#define _GLIBCXX_USE_THISCALL_ON_DTOR 1
+typedef void (__thiscall *__cxa_dtor_type) (void *);
+#endif
+
 #endif
Index: gcc/libstdc++-v3/config/os/mingw32/os_defines.h
===================================================================
--- gcc.orig/libstdc++-v3/config/os/mingw32/os_defines.h
+++ gcc/libstdc++-v3/config/os/mingw32/os_defines.h
@@ -65,4 +65,9 @@
 // ioctlsocket function doesn't work for normal file-descriptors.
 #define _GLIBCXX_NO_IOCTL 1
 
+#if defined (__i386__)
+#define _GLIBCXX_USE_THISCALL_ON_DTOR 1
+typedef void (__thiscall *__cxa_dtor_type) (void *);
+#endif
+
 #endif

(retested)

Regards,
Kai
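The pattern used by the patch (a generic header supplies a default function-pointer typedef unless the target's config header has already provided its own) can be sketched in plain C as follows. Every name here (EXAMPLE_USE_THISCALL_ON_DTOR, example_dtor_type, example_exception, example_cleanup and so on) is invented for the illustration and is not the libstdc++ code; the calling-convention variant itself is left out because it is only meaningful on 32-bit x86 Windows targets.

```c
/* Sketch of a config-overridable destructor pointer type.
   All identifiers are hypothetical.  */
#include <assert.h>
#include <stddef.h>

/* A target config header could define EXAMPLE_USE_THISCALL_ON_DTOR and
   supply its own example_dtor_type with a special calling convention.
   Everyone else gets the plain default below.  */
#ifndef EXAMPLE_USE_THISCALL_ON_DTOR
typedef void (*example_dtor_type) (void *);
#endif

/* Stand-in for the exception header that stores the destructor.  */
struct example_exception
{
  void *object;
  example_dtor_type destructor;
};

/* Counts how many times the sample destructor has run.  */
int example_destroyed = 0;

/* Sample destructor matching the default typedef.  */
void
example_count_destroy (void *obj)
{
  (void) obj;
  example_destroyed++;
}

/* Cleanup path: call the stored destructor through the typedef, so the
   call always uses whatever convention the typedef carries.  */
void
example_cleanup (struct example_exception *exc)
{
  assert (exc != NULL);
  if (exc->destructor != NULL)
    exc->destructor (exc->object);
}
```

The point of routing both the declaration and the stored member through one typedef is that the caller and the callee cannot disagree about the calling convention, which is the kind of mismatch the PR describes.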
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
Hi, So updated patch is: Looks good to me. I guess that in principle we could try to have a macro which is the typedef itself, but what you tested seems good enough to resolve the PR. Thanks, Paolo.
Re: [google] Add support for delete operator that takes the size of the object as a parameter
On Sun, Dec 11, 2011 at 19:05, Easwaran Raman era...@google.com wrote: Bootstraps and no test regressions. OK for google/gcc-4_6 branch? Any reason not to put this in google/main for future trunk inclusion. Should this be backported to gcc-4_6-branch? Diego.
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
PS: remember to adjust the Copyright years of eh_throw.cc and unwind-cxx.h. I would also mention the PR in the os_* files. Paolo.
Re: [google] Add support for delete operator that takes the size of the object as a parameter
On 12/12/2011 02:14 PM, Diego Novillo wrote: On Sun, Dec 11, 2011 at 19:05, Easwaran Ramanera...@google.com wrote: Bootstraps and no test regressions. OK for google/gcc-4_6 branch? Any reason not to put this in google/main for future trunk inclusion. Should this be backported to gcc-4_6-branch? Note that backporting the patch as-is to 4_6-branch would be very wrong in terms of ABI (in mainline we already have a 3.4.17) Paolo.
Re: [google] Add support for delete operator that takes the size of the object as a parameter
On Mon, Dec 12, 2011 at 08:17, Paolo Carlini paolo.carl...@oracle.com wrote: On 12/12/2011 02:14 PM, Diego Novillo wrote: On Sun, Dec 11, 2011 at 19:05, Easwaran Ramanera...@google.com wrote: Bootstraps and no test regressions. OK for google/gcc-4_6 branch? Any reason not to put this in google/main for future trunk inclusion. Should this be backported to gcc-4_6-branch? Note that backporting the patch as-is to 4_6-branch would be very wrong in terms of ABI (in mainline we already have a 3.4.17) Ah, right. I missed the ABI implications. Thanks. Diego.
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Paolo Carlini paolo.carl...@oracle.com: PS: remember to adjust the Copyright years of eh_throw.cc and unwind-cxx.h. I would also mention the PR in the os_* files. Paolo. Thanks for the reminder about copyright. Added comment about pr and added copyright year to files not mentioning 2011. Applied at revision 182237. Kai
Re: [google] Add support for delete operator that takes the size of the object as a parameter
On 12/12/2011 02:21 PM, Diego Novillo wrote: Ah, right. I missed the ABI implications. For possible inclusion in mainline too, things don't seem completely ok: nothing should be added to the baseline and very likely the export should be adjusted to accommodate for different size_t on the various targets, by using [] in the pattern. See, eg, the existing operator new[](size_t). Paolo.
Re: [patch] Fix PR tree-optimization/50569
On Mon, Dec 12, 2011 at 12:40 PM, Eric Botcazou ebotca...@adacore.com wrote:

Well, I can live with this change (though I cannot approve anything). On the other hand, the real underlying problem is that expander cannot handle unaligned MEM_REFs where strict alignment is required. SRA is of course much more prone to create such situations than anything else but I wonder whether they can creep up elsewhere too.

It also takes us in the opposite direction than the one initially intended with MEM_REFs, doesn't it?

Certainly, but we need to fix the regression in a relatively safe manner. That said, I looked into the expander briefly in summer but given my level of experience in that area I did not nearly have enough time. I still plan to look into this issue in expander but for the same reasons I cannot guarantee any quick success. So I acknowledge this is the only working approach to a long-standing difficult bug... and most probably the most appropriate for the 4.6 branch.

Thanks. This is still the same very old issue: misalignment cannot be handled indirectly (because we don't really have misaligned pointers) so MEM_REFs can be used safely only when everything is properly aligned.

We do have misaligned accesses - TYPE_ALIGN of TREE_TYPE of the MEM_REF reflects that. Similar to how one would write

typedef int myint __attribute__((aligned(1)));
int foo (myint *p) { return *p; }

which is a testcase that is miscompiled since forever on STRICT_ALIGNMENT targets (well, maybe apart from now for those who implement movmisalign). The fix is to fix the above testcase (which is a good idea anyway) and then to make sure to transition misaligned information to TREE_TYPE of the MEM_REF we create.

Richard.
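As a rough sketch of how the testcase above might be exercised, the fragment below places a myint at an odd offset inside a struct and reads it back through foo. The struct holder and the function misaligned_roundtrip are invented for this illustration, and the example relies on the GNU aligned attribute lowering the alignment of the typedef. On x86 it is expected to simply work; on a strict-alignment target it is the sort of access the message says has been miscompiled.

```c
/* Sketch: read an int through a pointer whose pointed-to type is
   declared with alignment 1.  holder and misaligned_roundtrip are
   hypothetical names.  */
#include <assert.h>

typedef int myint __attribute__ ((aligned (1)));

/* The function from the testcase: the load must not assume that p is
   aligned to the natural alignment of int.  */
int
foo (myint *p)
{
  return *p;
}

/* Because myint only requires alignment 1, the compiler is free to
   place value directly after pad, i.e. at an odd offset.  */
struct holder
{
  char pad;
  myint value;
};

/* Store VALUE into the possibly misaligned member and load it back
   through foo.  */
int
misaligned_roundtrip (int value)
{
  struct holder h;

  h.pad = 0;
  h.value = value;

  assert (foo (&h.value) == value);
  return foo (&h.value);
}
```

Using a struct member rather than casting a char buffer keeps the access type-correct, so the only thing under test is the alignment assumption.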
Fix flags for edges from/to entry/exit basic blocks (issue5486043)
Fix flags for edges from/to entry/exit basic blocks. W/o this patch I hit internal asserts when trying to split the edge from entry block.

Index: gcc/cgraphunit.c
===================================================================
--- gcc/cgraphunit.c    (revision 182237)
+++ gcc/cgraphunit.c    (working copy)
@@ -1459,8 +1459,8 @@
   /* Create BB for body of the function and connect it properly.  */
   bb = create_basic_block (NULL, (void *) 0, ENTRY_BLOCK_PTR);
-  make_edge (ENTRY_BLOCK_PTR, bb, 0);
-  make_edge (bb, EXIT_BLOCK_PTR, 0);
+  make_edge (ENTRY_BLOCK_PTR, bb, EDGE_FALLTHRU);
+  make_edge (bb, EXIT_BLOCK_PTR, EDGE_FALLTHRU);
 
   return bb;
 }
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog       (revision 182237)
+++ gcc/ChangeLog       (working copy)
@@ -1,3 +1,8 @@
+2011-12-12  Dmitry Vyukov  dvyu...@google.com
+
+        * cgraphunit.c (init_lowered_empty_function):
+        Fix flags for new edges.
+
 2011-12-12  Torvald Riegel  trie...@redhat.com
 
         * gimplify.c (voidify_wrapper_expr): Add default handling for

--
This patch is available for review at http://codereview.appspot.com/5486043
[Ada] Speed up 'Count attribute on Windows
On Windows, the 'Count attribute was very slow in Annex D mode. This patch fixes that efficiency problem. Annex D mode is invoked if there is a pragma Task_Dispatching_Policy (FIFO_Within_Priorities). Tested on i686-pc-mingw, committed on trunk 2011-12-12 Bob Duff d...@adacore.com * s-taprop-mingw.adb (Yield): Do not delay 1 millisecond in Annex D mode. Index: s-taprop-mingw.adb === --- s-taprop-mingw.adb (revision 182223) +++ s-taprop-mingw.adb (working copy) @@ -126,9 +126,6 @@ Foreign_Task_Elaborated : aliased Boolean := True; -- Used to identified fake tasks (i.e., non-Ada Threads) - Annex_D : Boolean := False; - -- Set to True if running with Annex-D semantics - Null_Thread_Id : constant Thread_Id := 0; -- Constant to indicate that the thread identifier has not yet been -- initialized. @@ -700,20 +697,9 @@ --- procedure Yield (Do_Yield : Boolean := True) is + pragma Unreferenced (Do_Yield); begin - if Do_Yield then - SwitchToThread; - - elsif Annex_D then - -- If running with Annex-D semantics we need a delay - -- above 0 milliseconds here otherwise processes give - -- enough time to the other tasks to have a chance to - -- run. - -- - -- This makes cxd8002 ACATS pass on Windows. - - Sleep (1); - end if; + SwitchToThread; end Yield; -- @@ -1076,8 +1062,6 @@ Discard := OS_Interface.SetPriorityClass (GetCurrentProcess, Realtime_Priority_Class); - - Annex_D := True; end if; TlsIndex := TlsAlloc;
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On Mon, 12 Dec 2011, Kai Tietz wrote:

Index: gcc/libstdc++-v3/libsupc++/cxxabi.h
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/cxxabi.h
+++ gcc/libstdc++-v3/libsupc++/cxxabi.h
@@ -51,6 +51,10 @@
 #include <bits/cxxabi_tweaks.h>
 #include <bits/cxxabi_forced.h>
 
+#ifndef _GLIBCXX_USE_THISCALL_ON_DTOR
+typedef void (*__cxa_dtor_type) (void *);
+#endif
+

This changes the type from a function with C linkage to one with C++ linkage, is that on purpose?

There is a type __cxa_cdtor_type a couple lines below, which also seems used for destructors, but that one doesn't get __thiscall, that's confusing (but then there's probably a reason why it wasn't used in __cxa_throw).

(Note: feel free to ignore, those are questions not comments, I don't know this code)

--
Marc Glisse
Fix compiler warnings in ThreadSanitizer tests (issue5483046)
This is for google-main branch. Fix compiler warnings in ThreadSanitizer tests.

Index: gcc/testsuite/ChangeLog.google-main
===================================================================
--- gcc/testsuite/ChangeLog.google-main (revision 182235)
+++ gcc/testsuite/ChangeLog.google-main (working copy)
@@ -1,3 +1,10 @@
+2011-12-12  Dmitry Vyukov  dvyu...@google.com
+
+        * gcc.dg/tsan.h: Fix compiler warnings.
+        * gcc.dg/tsan-ignore.c: Fix compiler warnings.
+        * gcc.dg/tsan-ignore.h: Fix compiler warnings.
+        * gcc.dg/tsan-mop.c: Fix compiler warnings.
+
 2011-10-17  Dehao Chen  de...@google.com
 
         * gcc.dg/record-gcc-switches-in-elf-1.c: New test.
Index: gcc/testsuite/gcc.dg/tsan.h
===================================================================
--- gcc/testsuite/gcc.dg/tsan.h (revision 182235)
+++ gcc/testsuite/gcc.dg/tsan.h (working copy)
@@ -15,7 +15,7 @@
 __thread int mop_expect = 0;
 __thread int mop_depth = 0;
 __thread void* mop_addr = 0;
-__thread unsigned long long mop_pc = 0;
+__thread unsigned long mop_pc = 0;
 __thread unsigned mop_flags = 0;
 __thread unsigned mop_line = 0;
 
@@ -40,15 +40,16 @@
 {
   if (mop_expect)
     {
-      printf ("missed mop: addr=%p pc=%d line=%d\n", mop_addr, mop_pc, mop_line);
+      printf ("missed mop: addr=%p pc=%p line=%d\n",
+              mop_addr, (void*)mop_pc, mop_line);
       exit (1);
     }
 
   mop_expect = 1;
   mop_depth = depth;
   mop_addr = (void*)addr;
-  mop_pc = (unsigned long long)__builtin_return_address(0);
-  mop_flags = !!is_sblock | (!!is_store << 1) | ((size - 1) << 2);
+  mop_pc = (unsigned long)__builtin_return_address(0);
+  mop_flags = (!!is_sblock) | ((!!is_store) << 1) | ((size - 1) << 2);
   mop_line = line;
 }
 
@@ -57,7 +58,7 @@
 void
 __tsan_handle_mop (void *addr, unsigned flags)
 {
-  unsigned long long pc;
+  unsigned long pc;
   int depth;
 
   printf ("mop: addr=%p flags=%x called from %p line=%d\n",
@@ -74,7 +75,7 @@
       exit (1);
     }
 
-  pc = (unsigned long long)__builtin_return_address(0);
+  pc = (unsigned long)__builtin_return_address(0);
   if (pc < mop_pc - 100 || pc > mop_pc + 100)
     {
       printf ("incorrect mop pc: %p/%p line=%d\n",
Index: gcc/testsuite/gcc.dg/tsan-ignore.c
===================================================================
--- gcc/testsuite/gcc.dg/tsan-ignore.c  (revision 182235)
+++ gcc/testsuite/gcc.dg/tsan-ignore.c  (working copy)
@@ -5,28 +5,32 @@
 
 /* Check ignore file handling. */
 
-int
+void
 foo (int *p)
 {
   p [0] = 1;
 }
 
-int bar (int *p)
+void
+bar (int *p)
 {
   p [0] = 1;
 }
 
-int baz (int *p)
+void
+baz (int *p)
 {
   p [0] = 1;
 }
 
-int bla (int *p)
+void
+bla (int *p)
 {
   p [0] = 1;
 }
 
-int xxx (int *p)
+void
+xxx (int *p)
 {
   p [0] = 1;
 }
Index: gcc/testsuite/gcc.dg/tsan-ignore.h
===================================================================
--- gcc/testsuite/gcc.dg/tsan-ignore.h  (revision 182235)
+++ gcc/testsuite/gcc.dg/tsan-ignore.h  (working copy)
@@ -1,4 +1,4 @@
-int
+void
 in_tsan_ignore_header (int *p)
 {
   p [0] = 1;
Index: gcc/testsuite/gcc.dg/tsan-mop.c
===================================================================
--- gcc/testsuite/gcc.dg/tsan-mop.c     (revision 182235)
+++ gcc/testsuite/gcc.dg/tsan-mop.c     (working copy)
@@ -28,7 +28,7 @@
 
   __tsan_expect_mop(2, &p->x, 0, 1, sizeof(p->x), __LINE__);
   tmp = p->x;
-
+  (void)tmp;
 }
 
 void testfor (P *p)
@@ -54,6 +54,7 @@
 
   __tsan_expect_mop(1, &p.x, 0, 1, sizeof(p.x), __LINE__);
   tmp = p.x;
+  (void)tmp;
 
   __tsan_expect_mop(1, &p.cp, 1, 1, sizeof(p.cp), __LINE__);
   p.cp = &p.c;

--
This patch is available for review at http://codereview.appspot.com/5483046
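To make the mop_flags change in tsan.h easier to follow, here is a minimal sketch of the bit packing, assuming the operators lost in the archived text are left shifts by 1 and 2. The function names pack_mop_flags, mop_flags_is_store and mop_flags_size are invented for the example and do not exist in the test header.

```c
/* Sketch of the flag layout, under the assumption stated above:
   bit 0      - access starts a superblock (is_sblock)
   bit 1      - access is a store (is_store)
   bits 2..   - access size minus one  */
#include <assert.h>

unsigned
pack_mop_flags (int is_sblock, int is_store, unsigned size)
{
  assert (size >= 1);
  /* The !! normalizes any non-zero value to exactly 1 before it is
     shifted into place; the explicit parentheses silence the
     precedence warnings the patch is about.  */
  return (!!is_sblock) | ((!!is_store) << 1) | ((size - 1) << 2);
}

/* Hypothetical decoders, the inverse of the packing above.  */
int
mop_flags_is_store (unsigned flags)
{
  return (flags >> 1) & 1;
}

unsigned
mop_flags_size (unsigned flags)
{
  return (flags >> 2) + 1;
}
```

For example, a 4-byte store that does not start a superblock packs to 2 | (3 << 2), i.e. 14.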
Re: Memset/memcpy patch
Any update? I will look into it today, but anyway I think it is stage1 material, so we have some time to progress on it. Honza
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
Hi,

On Mon, 12 Dec 2011, Kai Tietz wrote:

Index: gcc/libstdc++-v3/libsupc++/cxxabi.h
===================================================================
--- gcc.orig/libstdc++-v3/libsupc++/cxxabi.h
+++ gcc/libstdc++-v3/libsupc++/cxxabi.h
@@ -51,6 +51,10 @@
 #include <bits/cxxabi_tweaks.h>
 #include <bits/cxxabi_forced.h>
 
+#ifndef _GLIBCXX_USE_THISCALL_ON_DTOR
+typedef void (*__cxa_dtor_type) (void *);
+#endif
+

This changes the type from a function with C linkage to one with C++ linkage, is that on purpose?

Humm, thanks, I didn't really spend time on what was going on *below* the define, only on the right way to implement the mingw specific bits. I guess moving the #ifndef a few lines down, close to the other typedef, should be the safe thing to do. That also requires an adjustment in the config files: the typedef there must also be wrapped in #ifdef __cplusplus, etc. Please make the change, Kai.

There is a type __cxa_cdtor_type a couple lines below, which also seems used for destructors, but that one doesn't get __thiscall, that's confusing (but then there's probably a reason why it wasn't used in __cxa_throw).

No idea if it's right for mingw.

Paolo.
Re: [patch] add __is_final trait to fix libstdc++/51365
On Mon, Dec 12, 2011 at 5:25 AM, Paolo Carlini paolo.carl...@oracle.com wrote: I think being able to detect a final class is good enough for now, until we find out if there are real problems being encountered as people make more use of C++11. Maybe. But in my opinion we should not rush. Something is wrong here at a more fundamental level. I agree that we should wait a little bit for the dust to settle down. Users should avoid it, and implementors shouldn't go through hoops non commensurable with the benefits of final. Maybe the right primitive is slightly different.
Re: Fix flags for edges from/to entry/exit basic blocks (issue5486043)
On Mon, Dec 12, 2011 at 08:43, Dmitriy Vyukov dvyu...@google.com wrote: Fix flags for edges from/to entry/exit basic blocks. W/o this patch I hit internal asserts when trying to split the edge from entry block. Please specify how you tested it (http://gcc.gnu.org/contribute.html#testing). OK for trunk, if testing succeeds. Diego.
Re: [PATCH] Sink clobbers if EH block contains just clobbers (PR tree-optimization/51117)
Hello,

On Fri, 9 Dec 2011, Jakub Jelinek wrote:

I had to tweak a little bit the expander conflict checking, because if we have a BB with two incoming EH edges and clobber stmts from both sunk into its beginning, then it would consider both variables (a and b above) to be live at the same time, while there is no insn on which they can actually live at the same time, the PHIs don't mention either of them (and after all, PHIs aren't memory loads), and after the PHIs we have immediately the clobbers.

The idea is sound, the implementation can be tidied with the observation that only the first real instruction (instead of the BB start) is the point at which all currently live things need to be conflicted, like in the patch below (only cfgexpand.c part changed). I.e. moving the existing code from add_scope_conflicts_1 a bit is enough.

I'm putting this through regstrapping on x86_64-linux and will commit if that succeeds, given rth's approval for the other parts. I wonder how to best test this.

Ciao,
Michael.

        PR tree-optimization/51117
        * tree-eh.c (sink_clobbers): New function.
        (execute_lower_eh_dispatch): Call it for BBs ending with
        internally throwing RESX.
        * cfgexpand.c (add_scope_conflicts_1): Add all conflicts only
        at the first real instruction.

Index: cfgexpand.c
===================================================================
--- cfgexpand.c (revision 182241)
+++ cfgexpand.c (working copy)
@@ -456,34 +456,14 @@ add_scope_conflicts_1 (basic_block bb, b
   FOR_EACH_EDGE (e, ei, bb->preds)
     bitmap_ior_into (work, (bitmap)e->src->aux);
 
-  if (for_conflict)
-    {
-      /* We need to add conflicts for everything life at the start of
-         this block.  Unlike classical lifeness for named objects we can't
-         rely on seeing a def/use of the names we're interested in.
-         There might merely be indirect loads/stores.  We'd not add any
-         conflicts for such partitions.  */
-      bitmap_iterator bi;
-      unsigned i;
-      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
-        {
-          unsigned j;
-          bitmap_iterator bj;
-          EXECUTE_IF_SET_IN_BITMAP (work, i, j, bj)
-            add_stack_var_conflict (i, j);
-        }
-      visit = visit_conflict;
-    }
-  else
-    visit = visit_op;
+  visit = visit_op;
 
   for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
-      if (!is_gimple_debug (stmt))
-        walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
+      walk_stmt_load_store_addr_ops (stmt, work, NULL, NULL, visit);
     }
-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
 
@@ -501,7 +481,29 @@ add_scope_conflicts_1 (basic_block bb, b
             bitmap_clear_bit (work, *v);
         }
       else if (!is_gimple_debug (stmt))
-        walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
+        {
+          if (for_conflict
+              && visit == visit_op)
+            {
+              /* If this is the first real instruction in this BB we need
+                 to add conflicts for everything life at this point now.
+                 Unlike classical lifeness for named objects we can't
+                 rely on seeing a def/use of the names we're interested in.
+                 There might merely be indirect loads/stores.  We'd not add any
+                 conflicts for such partitions.  */
+              bitmap_iterator bi;
+              unsigned i;
+              EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+                {
+                  unsigned j;
+                  bitmap_iterator bj;
+                  EXECUTE_IF_SET_IN_BITMAP (work, i, j, bj)
+                    add_stack_var_conflict (i, j);
+                }
+              visit = visit_conflict;
+            }
+          walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
+        }
     }
 }
 
Index: tree-eh.c
===================================================================
--- tree-eh.c   (revision 182241)
+++ tree-eh.c   (working copy)
@@ -3194,6 +3194,76 @@ optimize_clobbers (basic_block bb)
     }
 }
 
+/* Try to sink var = {v} {CLOBBER} stmts followed just by
+   internal throw to successor BB.  */
+
+static int
+sink_clobbers (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+  gimple_stmt_iterator gsi, dgsi;
+  basic_block succbb;
+  bool any_clobbers = false;
+
+  /* Only optimize if BB has a single EH successor and
+     all predecessor edges are EH too.  */
+  if (!single_succ_p (bb)
+      || (single_succ_edge (bb)->flags & EDGE_EH) == 0)
+    return 0;
+
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      if ((e->flags & EDGE_EH) == 0)
+        return 0;
+    }
+
+  /* And BB contains only CLOBBER stmts before the final
+     RESX.  */
+  gsi = gsi_last_bb (bb);
+  for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
+
[Patch, wwwdocs, committed] gcc-4.7/changes.html#Fortran: Add polymorphic arrays
I have committed the attached patch for
http://gcc.gnu.org/gcc-4.7/changes.html#fortran

Tobias

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.67
diff -u -p -r1.67 changes.html
--- changes.html 6 Dec 2011 00:44:54 - 1.67
+++ changes.html 12 Dec 2011 15:06:56 -
@@ -477,6 +477,8 @@ well.</p></li>
     that Fortran does not support static constructor functions; only
     default initialization or an explicit structure-constructor
     initialization are available.</li>
+    <li><a href="http://gcc.gnu.org/wiki/OOP">Polymorphic</a>
+    (<code>class</code>) arrays are now supported.</li>
   </ul></li>
   <li><a href="http://gcc.gnu.org/wiki/Fortran2008Status">Fortran 2008</a>:
   <ul>
Re: PR/50076 make c-c++-common/cxxbitfields-3.c work in Darwin
On 12/11/11 16:51, Mike Stump wrote: On Dec 9, 2011, at 11:45 AM, Aldy Hernandez wrote: How about the patch below? I'm fine with whatever you guys come up with... Likewise. I have no preference. Whatever gets approved is ok with me.
Re: [PATCH 3/3] arm-linux: Add libitm support.
Hi Richard, Comments inline below.. On Sat, Dec 10, 2011 at 03:21:23PM -0800, Richard Henderson wrote: --- libitm/Makefile.am |3 + libitm/Makefile.in | 20 +++-- libitm/config/arm/hwcap.cc | 67 + libitm/config/arm/hwcap.h| 41 ++ libitm/config/arm/sjlj.S | 135 ++ libitm/config/arm/target.h | 62 libitm/config/generic/asmcfi.h | 13 ++- libitm/config/linux/arm/futex_bits.h | 48 libitm/configure | 18 - libitm/configure.ac |1 + libitm/configure.tgt |2 + 11 files changed, 395 insertions(+), 15 deletions(-) create mode 100644 libitm/config/arm/hwcap.cc create mode 100644 libitm/config/arm/hwcap.h create mode 100644 libitm/config/arm/sjlj.S create mode 100644 libitm/config/arm/target.h create mode 100644 libitm/config/linux/arm/futex_bits.h diff --git a/libitm/Makefile.am b/libitm/Makefile.am index 26e1ebc..d417026 100644 --- a/libitm/Makefile.am +++ b/libitm/Makefile.am @@ -62,6 +62,9 @@ libitm_la_SOURCES = \ query.cc retry.cc rwlock.cc useraction.cc util.cc \ sjlj.S tls.cc method-serial.cc method-gl.cc +if ARCH_ARM +libitm_la_SOURCES += hwcap.cc +endif if ARCH_X86 libitm_la_SOURCES += x86_sse.cc x86_avx.cc x86_sse.lo : XCFLAGS += -msse diff --git a/libitm/Makefile.in b/libitm/Makefile.in index dc77382..5305f4c 100644 --- a/libitm/Makefile.in +++ b/libitm/Makefile.in @@ -36,8 +36,9 @@ POST_UNINSTALL = : build_triplet = @build@ host_triplet = @host@ target_triplet = @target@ -@ARCH_X86_TRUE@am__append_1 = x86_sse.cc x86_avx.cc -@ARCH_FUTEX_TRUE@am__append_2 = futex.cc +@ARCH_ARM_TRUE@am__append_1 = hwcap.cc +@ARCH_X86_TRUE@am__append_2 = x86_sse.cc x86_avx.cc +@ARCH_FUTEX_TRUE@am__append_3 = futex.cc subdir = . 
DIST_COMMON = $(am__configure_deps) $(srcdir)/../config.guess \ $(srcdir)/../config.sub $(srcdir)/../depcomp \ @@ -99,15 +100,16 @@ libitm_la_LIBADD = am__libitm_la_SOURCES_DIST = aatree.cc alloc.cc alloc_c.cc \ alloc_cpp.cc barrier.cc beginend.cc clone.cc eh_cpp.cc \ local.cc query.cc retry.cc rwlock.cc useraction.cc util.cc \ - sjlj.S tls.cc method-serial.cc method-gl.cc x86_sse.cc \ - x86_avx.cc futex.cc -@ARCH_X86_TRUE@am__objects_1 = x86_sse.lo x86_avx.lo -@ARCH_FUTEX_TRUE@am__objects_2 = futex.lo + sjlj.S tls.cc method-serial.cc method-gl.cc hwcap.cc \ + x86_sse.cc x86_avx.cc futex.cc +@ARCH_ARM_TRUE@am__objects_1 = hwcap.lo +@ARCH_X86_TRUE@am__objects_2 = x86_sse.lo x86_avx.lo +@ARCH_FUTEX_TRUE@am__objects_3 = futex.lo am_libitm_la_OBJECTS = aatree.lo alloc.lo alloc_c.lo alloc_cpp.lo \ barrier.lo beginend.lo clone.lo eh_cpp.lo local.lo query.lo \ retry.lo rwlock.lo useraction.lo util.lo sjlj.lo tls.lo \ method-serial.lo method-gl.lo $(am__objects_1) \ - $(am__objects_2) + $(am__objects_2) $(am__objects_3) libitm_la_OBJECTS = $(am_libitm_la_OBJECTS) DEFAULT_INCLUDES = -I.@am__isrc@ depcomp = $(SHELL) $(top_srcdir)/../depcomp @@ -376,7 +378,8 @@ libitm_la_LDFLAGS = $(libitm_version_info) $(libitm_version_script) libitm_la_SOURCES = aatree.cc alloc.cc alloc_c.cc alloc_cpp.cc \ barrier.cc beginend.cc clone.cc eh_cpp.cc local.cc query.cc \ retry.cc rwlock.cc useraction.cc util.cc sjlj.S tls.cc \ - method-serial.cc method-gl.cc $(am__append_1) $(am__append_2) + method-serial.cc method-gl.cc $(am__append_1) $(am__append_2) \ + $(am__append_3) # Automake Documentation: # If your package has Texinfo files in many directories, you can use the @@ -505,6 +508,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/clone.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eh_cpp.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/futex.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/hwcap.Plo@am__quote@ 
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/local.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/method-gl.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/method-serial.Plo@am__quote@ diff --git a/libitm/config/arm/hwcap.cc b/libitm/config/arm/hwcap.cc new file mode 100644 index 000..007c10e --- /dev/null +++ b/libitm/config/arm/hwcap.cc @@ -0,0 +1,67 @@ +/* Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Richard Henderson r...@redhat.com. + + This file is part of the GNU Transactional Memory Library (libitm). + + Libitm is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + Libitm is distributed in the hope
Re: PR/50076 make c-c++-common/cxxbitfields-3.c work in Darwin
> > I'm fine with whatever you guys come up with...
>
> Likewise.  I have no preference.  Whatever gets approved is ok with me.

So let's pick Iain's proposal:

Index: gcc/testsuite/c-c++-common/cxxbitfields-3.c
===================================================================
--- gcc/testsuite/c-c++-common/cxxbitfields-3.c (revision 182177)
+++ gcc/testsuite/c-c++-common/cxxbitfields-3.c (working copy)
@@ -18,4 +18,5 @@ void setit()
   var.j = 5;
 }
 
-/* { dg-final { scan-assembler "movl.*, var" } } */
+/* { dg-final { scan-assembler "movl.*, _?var" { target { ! *-*-darwin* } } } } */
+/* { dg-final { scan-assembler "movl.*, (_?var|\\(%)" { target *-*-darwin* } } } */

Dominique
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
2011/12/12 Paolo Carlini paolo.carl...@oracle.com: Hi, On Mon, 12 Dec 2011, Kai Tietz wrote: Index: gcc/libstdc++-v3/libsupc++/cxxabi.h === --- gcc.orig/libstdc++-v3/libsupc++/cxxabi.h +++ gcc/libstdc++-v3/libsupc++/cxxabi.h @@ -51,6 +51,10 @@ #include bits/cxxabi_tweaks.h #include bits/cxxabi_forced.h +#ifndef _GLIBCXX_USE_THISCALL_ON_DTOR +typedef void (*__cxa_dtor_type) (void *); +#endif + This changes the type from a function with C linkage to one with C++ linkage, is that on purpose? Humm, thanks, I didn't really spend time on what was going on *below* the define, only to the right way to implement the mingw specific bits. I guess moving the #ifndef a few lines down, close to the other typedef should be the safe thing to do. That also requires adjustment in the config files, the typedef there must be also wrapped in #ifdef __cplusplus, etc. Please do the Change Kai. Ok. By looking at this, it might be better to use here a define - as you mentioned. As I would need to copy here namespace too. There is a type __cxa_cdtor_type a couple lines below, which also seems used for destructors, but that one doesn't get __thiscall, that's confusing (but then there's probably a reason why it wasn't used in __cxa_throw). No idea if it's right for mingw. Well, not sure too. Logically, if those function in cdtor list (handled in vec.cc) are constructors/destructors, then it would require thiscall for IA-32 mingw, too. By tests I see that those function stored within that list have cdecl-calling convention. Therefore I didn't touched them by this patch. Kai Kai
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12 December 2011 12:42, Kai Tietz wrote: PR libstdc++/511135 * libsupc++/cxxabi.h (__cxxabi_dtor_type): New type. ChangeLog needs to be updated for the new type name (whether it ends up being __cxa_dtor_type or something else).
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
Hi, Ok. By looking at this, it might be better to use here a define - as you mentioned. As I would need to copy here namespace too. Ok, thanks. Let's make sure nothing can possibly change for != mingw, we don't want to take risks at this time. Paolo.
Re: PR/50076 make c-c++-common/cxxbitfields-3.c work in Darwin
On Mon, Dec 12, 2011 at 04:20:57PM +0100, Dominique Dhumieres wrote: I'm fine with whatever you guys come up with... Likewise. I have no preference. Whatever gets approved is ok with me. So let's pick the Iain's proposal: Index: gcc/testsuite/c-c++-common/cxxbitfields-3.c === --- gcc/testsuite/c-c++-common/cxxbitfields-3.c (revision 182177) +++ gcc/testsuite/c-c++-common/cxxbitfields-3.c (working copy) @@ -18,4 +18,5 @@ void setit() var.j = 5; } -/* { dg-final { scan-assembler movl.*, var } } */ +/* { dg-final { scan-assembler movl.*, _?var { target { ! *-*-darwin* } } } } */ +/* { dg-final { scan-assembler movl.*, (_?var|\\(%) { target *-*-darwin* } } } */ Only if *-*-darwin* is replaced with !nonpic, otherwise it will fail say on x86_64-linux if people test with RUNTESTFLAGS=--target_board=unix/-fpic Jakub
Re: [PATCH] Fix PR middle-end/45416, missing opt for (a&(1<<C))!=0 to (a>>C)&1
Hi, On Fri, 9 Dec 2011, Georg-Johann Lay wrote: This is pretty much straight forward, and I don't understand the problems with - canonicalize stuff - optimize on canonicalized representation - lower canonicalized representation to best RTL I don't think anyone would reject patches that do this generally. Absent such patches GCC will invariably always be tuned for targets that most developers care about; even at the expense of smaller targets. Ciao, Michael.
Re: PR/50076 make c-c++-common/cxxbitfields-3.c work in Darwin
On 12 Dec 2011, at 15:47, Jakub Jelinek wrote:

> On Mon, Dec 12, 2011 at 04:20:57PM +0100, Dominique Dhumieres wrote:
> > > > I'm fine with whatever you guys come up with...
> > >
> > > Likewise.  I have no preference.  Whatever gets approved is ok with me.
> >
> > So let's pick Iain's proposal:
> >
> > Index: gcc/testsuite/c-c++-common/cxxbitfields-3.c
> > ===================================================================
> > --- gcc/testsuite/c-c++-common/cxxbitfields-3.c (revision 182177)
> > +++ gcc/testsuite/c-c++-common/cxxbitfields-3.c (working copy)
> > @@ -18,4 +18,5 @@ void setit()
> >    var.j = 5;
> >  }
> >
> > -/* { dg-final { scan-assembler "movl.*, var" } } */
> > +/* { dg-final { scan-assembler "movl.*, _?var" { target { ! *-*-darwin* } } } } */
> > +/* { dg-final { scan-assembler "movl.*, (_?var|\\(%)" { target *-*-darwin* } } } */
>
> Only if *-*-darwin* is replaced with !nonpic, otherwise it will fail
> say on x86_64-linux if people test with
> RUNTESTFLAGS=--target_board=unix/-fpic

Thus, is everyone reasonably happy with the following?

Index: gcc/testsuite/c-c++-common/cxxbitfields-3.c
===================================================================
--- gcc/testsuite/c-c++-common/cxxbitfields-3.c (revision 182219)
+++ gcc/testsuite/c-c++-common/cxxbitfields-3.c (working copy)
@@ -18,4 +18,5 @@ void setit()
   var.j = 5;
 }
 
-/* { dg-final { scan-assembler "movl.*, var" } } */
+/* { dg-final { scan-assembler "movl.*, _?var" { target nonpic } } } */
+/* { dg-final { scan-assembler "movl.*, (_?var|\\(%)" { target { ! nonpic } } } } */
Re: [patch] PR51347 alias problem
> Yes the testcase attached in the PR works for me but I can't change the
> status because I am not the reporter (nor admin).

I will close it.

> However, the testcase I have added g++.dg/tm/ctor-used.C fails.
> I can file another PR but I found this problem thanks to the PR testcase.

If you mean the following test, there is no ICE here either with current sources.  However, I do see that you expect something else to be generated.

If I compile it with optimization (-O1), there is no call to the runtime, as expected (no _ITM_getTMCloneOrIrrevocable); we just inline the initialization to 0 inside the transaction, and we optimize away the constructor C():l(0).

If I compile with no optimization, there is a call through the runtime (which, according to your test, you don't expect; why?), and we generate code for C():l(0).  This seems correct.

I don't see anything wrong with the generated code.  What are you expecting?

+/* { dg-do compile } */
+/* { dg-options "-fgnu-tm -fdump-tree-optimized" } */
+
+struct C {
+  long l;
+  C():l(0) {}
+};
+
+int main()
+{
+  C* alloc;
+  __transaction_atomic {
+    alloc = new C;
+  }
+  alloc->l = 2;
+
+  return 0;
+}
+/* { dg-final { scan-assembler-not "_ITM_getTMCloneOrIrrevocable" } } */
+/* { dg-final { scan-tree-dump-times ";; Function C::C" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Re: PR/50076 make c-c++-common/cxxbitfields-3.c work in Darwin
On Mon, Dec 12, 2011 at 04:18:29PM +, Iain Sandoe wrote: thus is everyone reasonably happy with? Index: gcc/testsuite/c-c++-common/cxxbitfields-3.c === --- gcc/testsuite/c-c++-common/cxxbitfields-3.c (revision 182219) +++ gcc/testsuite/c-c++-common/cxxbitfields-3.c (working copy) @@ -18,4 +18,5 @@ void setit() var.j = 5; } -/* { dg-final { scan-assembler movl.*, var } } */ +/* { dg-final { scan-assembler movl.*, _?var { target nonpic } } } */ +/* { dg-final { scan-assembler movl.*, (_?var|\\(%) { target { ! nonpic } } } } */ Yes, this is ok for the trunk with proper ChangeLog entry. Jakub
Re: Fix flags for edges from/to entry/exit basic blocks (issue 5486043)
On 11-12-12 11:18 , dvyu...@google.com wrote: I've done full 3 stage build for all front-ends, then 'make bootstrap', then diff output of 'make check-gcc -j16 RUNTESTFLAGS=dg.exp' with non-modified version. Everything passed successfully. All that on Linux/amd64. OK, thanks. That's enough. Unfortunately that other script for gcc testing does not work on my machine... Other script? You mean the one we use internally? Grab me on IM. Diego.
Re: RFC: ARM 64-bit shifts in NEON
On 07/12/11 13:42, Richard Earnshaw wrote:

> So it looks like the code generated for core registers with thumb2 is
> pretty rubbish (no real surprise there) -- to get the best code you need
> to make use of the fact that on ARM a shift by a small negative number
> (> -128) will give zero.  This gives us sequences like:
>
> For ARM state it's something like (untested)
>
>                                     @ shft < 32 , shft >= 32
> __ashldi3_v3:
>         sub     r3, r2, #32         @ -ve , shft - 32
>         lsl     ah, ah, r2          @ ah << shft , 0
>         rsb     ip, r2, #32         @ 32 - shft , -ve
>         orr     ah, ah, al, lsl r3  @ ah << shft , al << shft - 32
>         orr     ah, ah, al, lsr ip  @ ah << shft | al >> 32 - shft , al << shft - 32
>         lsl     al, al, r2          @ al << shft , 0
>
> For Thumb2 (where there is no orr with register shift)
>
>         lsls    ah, ah, r2          @ ah << shft , 0
>         sub     r3, r2, #32         @ -ve , shft - 32
>         lsl     ip, al, r3          @ 0 , al << shft - 32
>         negs    r3, r3              @ 32 - shft , -ve
>         orr     ah, ah, ip          @ ah << shft , al << shft - 32
>         lsr     r3, al, r3          @ al >> 32 - shft , 0
>         orrs    ah, ah, r3          @ ah << shft | al >> 32 - shft , al << shft - 32
>         lsls    al, al, r2          @ al << shft , 0
>
> Neither of which needs the condition flags during execution (and indeed
> is probably better in both cases than the code currently in lib1funcs.asm
> for a modern core).  The flag clobbering behaviour in the thumb2 variant
> is only for code size saving; that would normally be added by a late
> optimization pass.
>
> None of this directly helps with your neon usage, but it does show that
> we really don't need to clobber the condition code register to get an
> efficient sequence.

Unfortunately, both these sequences use two scratch registers, as shown, and that's worse than clobbering CC.

Now, I can implement this for non-Neon easily enough, I think, and that would be a win, but I'm trying to figure out how best to do it for both that case and the case where neon is available but the compiler chooses not to use it.

The problem is that when there is no neon available, this can be converted at expand or split1 time, but when neon *is* available we have to wait until a post-reload split, and then we'd be forced to expand this in early-clobber mode, which is far less optimal.

Any suggestions how to do this without pessimizing the code in the case that neon is available but not used?

In fact, is the general shift operation sufficiently expensive that I should just abandon the fall back alternatives and *always* use Neon when available? In this case, what about A8 vs. A9?

Thanks

Andrew
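For readers following along, the branchless ARM-state sequence quoted above can be modelled in plain C.  This is only a sketch under stated assumptions: arm_lsl and arm_lsr are hypothetical helpers that imitate a register-controlled shift (only the bottom byte of the shift register is used, and amounts of 32 or more give zero), and ashldi3_branchless is an illustrative name, not anything in lib1funcs.asm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an ARM register-controlled LSL: the amount is
   the bottom byte of the register, and amounts >= 32 yield zero.  A
   small negative amount (> -128) therefore also yields zero.  */
static uint32_t
arm_lsl (uint32_t x, int32_t amount)
{
  uint32_t n = (uint32_t) amount & 0xff;
  return n >= 32 ? 0 : x << n;
}

/* Same model for a register-controlled LSR.  */
static uint32_t
arm_lsr (uint32_t x, int32_t amount)
{
  uint32_t n = (uint32_t) amount & 0xff;
  return n >= 32 ? 0 : x >> n;
}

/* Shift the 64-bit value hi:lo left by SHIFT (0 <= SHIFT <= 63) without
   branching on whether SHIFT is below or above 32.  */
static uint64_t
ashldi3_branchless (uint32_t hi, uint32_t lo, int32_t shift)
{
  hi = arm_lsl (hi, shift)          /* ah << shft, or 0 when shft >= 32  */
       | arm_lsl (lo, shift - 32)   /* 0 when shft < 32  */
       | arm_lsr (lo, 32 - shift);  /* 0 when shft > 32 (and when shft == 0)  */
  lo = arm_lsl (lo, shift);
  return ((uint64_t) hi << 32) | lo;
}

/* Compare the model against the native 64-bit shift for every amount.  */
static void
ashldi3_selfcheck (uint64_t value)
{
  for (int32_t shift = 0; shift < 64; shift++)
    assert (ashldi3_branchless ((uint32_t) (value >> 32), (uint32_t) value,
                                shift)
            == value << shift);
}
```

Calling ashldi3_selfcheck on any 64-bit value exercises all 64 shift amounts; the model says nothing about register pressure or the early-clobber issue raised above.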
Re: [PATCH] Sink clobbers if EH block contains just clobbers (PR tree-optimization/51117)
Hi, On Mon, 12 Dec 2011, Jakub Jelinek wrote: Looks cleaner, yes. Just I wonder: 1) what if a bb contains no real insns (I know, they should be optimized out, but do we assert that etc.?) - then the EXECUTE_IF_SET_IN_BITMAP loop just wouldn't be done. Perhaps that is fine, it would make it into the bitmap at the end of the bb and perhaps following bb would do this loop. Not only perhaps. That is exactly what will happen. If one of the successor BBs then has real instructions, _that_ one will cause creation of all the necessary conflicts. 2) the PHIs are then handled always with visit_op instead of visit_conflict, I'd guess the needed add_stack_var_conflict calls would then happen in that EXECUTE_IF_SET_IN_BITMAP loop, right? Correct. The PHIs don't need to create the conflicts, any new mention of a DECL name will be noted as active, and then creates a conflict at the next real instruction (if not cancelled by a clobber before). I wonder how to best test this. One kind of testing was watching the size of .gcc_except_table going down with each patch (vanilla -> optimize_clobbers -> optimize_clobbers thinko fix -> sink_clobbers). My idea was to somehow check the EH tree for some dump (.ehcleanup2 perhaps) for being in the expected form, or that the correct number of removals happen. And of course that the sharing still happens, but that's even worse to test in the .expand dump :-/ Ciao, Michael.
Re: [PATCH] Sink clobbers if EH block contains just clobbers (PR tree-optimization/51117)
On Mon, Dec 12, 2011 at 05:29:09PM +0100, Michael Matz wrote: On Mon, 12 Dec 2011, Jakub Jelinek wrote: Just I wonder: 1) what if a bb contains no real insns (I know, they should be optimized out, but do we assert that etc.?) - then the EXECUTE_IF_SET_IN_BITMAP loop just wouldn't be done. Perhaps that is fine, it would make it into the bitmap at the end of the bb and perhaps following bb would do this loop. Not only perhaps. That is exactly what will happen. If some of the successor BBs then has real instructions _that_ one will cause creation of all the necessary conflicts. Ok. 2) the PHIs are then handled always with visit_op instead of visit_conflict, I'd guess the needed add_stack_var_conflict calls would then happen in that EXECUTE_IF_SET_IN_BITMAP loop, right? Correct. The PHIs don't need to create the conflicts, any new mention of a DECL name will be noted as active, and then creates a conflict at the next real instruction (if not cancelled by a clobber before). Ok. So, I'm happy with your changes and rth already acked the tree-eh.c side, so can we just get an ack on these cfgexpand.c changes? Thanks. The testcases can perhaps follow when they are ready (and I plan to submit at least the one checking for no __cxa_rethrow in assembly). Jakub
[committed] Don't grow internal_arg_pointer_exp_state.cache vector when idx is smaller than VEC_length (PR middle-end/51510)
Hi!

When idx is smaller than VEC_length (which happens when some pseudo is set more than once in the tail call sequence), we should not try to grow the vector to a smaller size than it has.

Bootstrapped/regtested on x86_64-linux and i686-linux, tested in arm cross on the testcase mentioned in the PR with -march=iwmmxt -O2, committed to trunk as obvious.  Will backport to 4.6 soon.

2011-12-12  Jakub Jelinek  <ja...@redhat.com>

        PR middle-end/51510
        * calls.c (internal_arg_pointer_based_exp_scan): Don't use
        VEC_safe_grow_cleared if idx is smaller than VEC_length.

--- gcc/calls.c.jj 2011-12-08 16:36:42.0 +0100
+++ gcc/calls.c 2011-12-12 09:59:26.543358601 +0100
@@ -1705,9 +1705,11 @@ internal_arg_pointer_based_exp_scan (voi
               val = internal_arg_pointer_based_exp (SET_SRC (set), false);
               if (val != NULL_RTX)
                 {
-                  VEC_safe_grow_cleared (rtx, heap,
-                                         internal_arg_pointer_exp_state.cache,
-                                         idx + 1);
+                  if (idx
+                      >= VEC_length (rtx, internal_arg_pointer_exp_state.cache))
+                    VEC_safe_grow_cleared (rtx, heap,
+                                           internal_arg_pointer_exp_state.cache,
+                                           idx + 1);
                   VEC_replace (rtx, internal_arg_pointer_exp_state.cache,
                                idx, val);
                 }

        Jakub
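The guard in this fix is the usual "grow only when the index is past the end" pattern.  A minimal standalone sketch follows, assuming a plain realloc-backed array in place of GCC's VEC macros; struct int_vec and int_vec_store are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array standing in for the VEC cache.  */
struct int_vec
{
  int *data;
  size_t length;
};

/* Store VALUE at IDX.  The array is grown (zero-filled) only when IDX is
   not yet a valid index; a smaller IDX must never shrink it, which is
   the mistake the patch above avoids.  Returns 0 on success, -1 if the
   allocation fails.  */
static int
int_vec_store (struct int_vec *vec, size_t idx, int value)
{
  if (idx >= vec->length)
    {
      size_t new_length = idx + 1;
      int *grown = realloc (vec->data, new_length * sizeof *grown);
      if (grown == NULL)
        return -1;
      /* Clear only the newly added tail, like a "grow cleared" call.  */
      memset (grown + vec->length, 0,
              (new_length - vec->length) * sizeof *grown);
      vec->data = grown;
      vec->length = new_length;
    }
  vec->data[idx] = value;
  return 0;
}
```

Storing at index 3 and then at index 1 leaves the length at 4; without the guard, the second store would have asked for a length of 2.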
[committed] Fix gcc.dg/pr45819.c testcase on arm (PR testsuite/51511)
Hi!

On ARM, because -fstrict-volatile-bitfields is on, we get a warning about a volatile access to an unaligned field; this patch adds -w so the test does not fail because of that warning.

Regtested on x86_64-linux and i686-linux and with arm cross on the given testcase, committed as obvious.

2011-12-12  Jakub Jelinek  <ja...@redhat.com>

        PR testsuite/51511
        * gcc.dg/pr45819.c: Add -w to dg-options.

--- gcc/testsuite/gcc.dg/pr45819.c.jj 2011-07-22 22:14:59.0 +0200
+++ gcc/testsuite/gcc.dg/pr45819.c 2011-12-12 10:10:43.946416951 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-optimized -w" } */
 
 struct ehci_regs {
   char x;

        Jakub
Re: [PATCH 3/6] ia64: Implement vec_perm_const.
Richard, I am hitting an assert in expand_vec_perm_even_odd on IA64 HP-UX with your patch. Using gcc.c-torture/compile/900116-1.c as a test case and compiling at -O3 I get: x.c: In function 'zloop': x.c:3:1: internal compiler error: in expand_vec_perm_even_odd, at config/ia64/ia64.c:11145 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. Looking in the debugger it appears that nelt is 8 at the assertion. Steve Ellcey s...@cup.hp.com
[PATCH] Call maybe_clean_or_replace_eh_stmt in gimple_fold_call (PR tree-optimization/51481)
Hi!

In gimple_fold_call (called from fold_stmt from replace_uses_by from cfg cleanup) we weren't calling maybe_clean_or_replace_eh_stmt, so when we've replaced a printf call (which can throw) with puts (which can throw too), nothing would update EH stmts.  It would be problematic calling gimple_purge_dead_eh_edges, because callers might be surprised by that, especially when this happens during cfg cleanup, so instead I just assert it is not needed and don't try to fold if a throwing stmt would be replaced by non-throwing.  FAB pass can handle that instead.  No folding has been actually disabled because of that check during bootstrap/regtest, so it is there just in case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-12-12  Jakub Jelinek  <ja...@redhat.com>

        PR tree-optimization/51481
        * gimple-fold.c (gimple_fold_call): Call
        maybe_clean_or_replace_eh_stmt.  Avoid optimization if stmt has EH
        edges, but gimple_fold_builtin result can't throw.

        * gcc.dg/pr51481.c: New test.

--- gcc/gimple-fold.c.jj 2011-12-11 22:02:37.0 +0100
+++ gcc/gimple-fold.c 2011-12-12 11:42:34.740168390 +0100
@@ -1117,10 +1117,21 @@ gimple_fold_call (gimple_stmt_iterator *
   if (callee && DECL_BUILT_IN (callee))
     {
       tree result = gimple_fold_builtin (stmt);
-      if (result)
+      if (result
+          /* Disallow EH edge removal here.  We can't call
+             gimple_purge_dead_eh_edges here.  */
+          && (lookup_stmt_eh_lp (stmt) == 0
+              || tree_could_throw_p (result)))
         {
           if (!update_call_from_tree (gsi, result))
             gimplify_and_update_call_from_tree (gsi, result);
+          if (!gsi_end_p (*gsi))
+            {
+              gimple new_stmt = gsi_stmt (*gsi);
+              bool update_eh ATTRIBUTE_UNUSED
+                = maybe_clean_or_replace_eh_stmt (stmt, new_stmt);
+              gcc_assert (!update_eh);
+            }
           changed = true;
         }
     }
--- gcc/testsuite/gcc.dg/pr51481.c.jj 2011-12-12 11:18:27.304678207 +0100
+++ gcc/testsuite/gcc.dg/pr51481.c 2011-12-12 11:18:02.0 +0100
@@ -0,0 +1,33 @@
+/* PR tree-optimization/51481 */
+/* { dg-do compile } */
+/* { dg-options "-O -fexceptions -fipa-cp -fipa-cp-clone" } */
+
+extern const unsigned short int **foo (void)
+  __attribute__ ((__nothrow__, __const__));
+struct S { unsigned short s1; int s2; };
+extern struct S *s[26];
+
+void
+bar (int x, struct S *y, ...)
+{
+  static struct S *t;
+  __builtin_va_list ap;
+  __builtin_va_start (ap, y);
+  if (t != s[7])
+    {
+      const char *p = "aAbBc";
+      t = s[7];
+      while ((*foo ())[(unsigned char) *p])
+        p++;
+    }
+  __builtin_printf (x == 0 ? "abc\n" : "def\n");
+  if (y != 0)
+    __builtin_printf ("ghi %d %d", y->s2, y->s1);
+  __builtin_va_end (ap);
+}
+
+void
+baz (char *x)
+{
+  bar (1, 0, x);
+}

        Jakub
Re: Fix compiler warnings in ThreadSanitizer tests (issue 5483046)
ok for google/main. David http://codereview.appspot.com/5483046/
Re: [PATCH] Sink clobbers if EH block contains just clobbers (PR tree-optimization/51117)
Hi, On Mon, 12 Dec 2011, Jakub Jelinek wrote: Ok. So, I'm happy with your changes and rth already acked the tree-eh.c side, so can we just get an ack on these cfgexpand.c changes? Thanks. Hmpf, I would have simply committed without a re-approval, but if you think it's necessary I'll wait. FYI, I've actually regstrapped with the patch starting iterating from i+1 in the nested EXECUTE_IF_SET_IN_BITMAP to ignore the diagonal. Ciao, Michael.
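The nested loop discussed in this thread records a conflict for every pair of stack variables that are live at the same point; starting the inner walk at i + 1 visits each unordered pair once and skips the (i, i) diagonal.  A small sketch of that idea, assuming a 32-bit mask in place of GCC's bitmap type and a hypothetical add_conflict helper in place of add_stack_var_conflict:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VARS 32

/* conflicts[i] has bit j set when variables i and j conflict.  This
   table and the helpers below are illustrative stand-ins only.  */
static uint32_t conflicts[MAX_VARS];

/* Record a symmetric conflict between variables I and J.  */
static void
add_conflict (unsigned i, unsigned j)
{
  conflicts[i] |= UINT32_C (1) << j;
  conflicts[j] |= UINT32_C (1) << i;
}

/* Add a conflict between every pair of variables set in LIVE.  The inner
   loop starts at i + 1, so no variable conflicts with itself and each
   pair is handled exactly once.  */
static void
add_conflicts_for_live_set (uint32_t live)
{
  for (unsigned i = 0; i < MAX_VARS; i++)
    {
      if (!(live & (UINT32_C (1) << i)))
        continue;
      for (unsigned j = i + 1; j < MAX_VARS; j++)
        if (live & (UINT32_C (1) << j))
          add_conflict (i, j);
    }
}
```

With variables 0, 1 and 3 live, each of them ends up conflicting with the other two and with nothing else.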
[C++ PATCH] Fix for-2.C OpenMP regression (PR c++/51496)
Hi!

On

extern void baz (int);
template <long N>
void
f7 (int i, int x, int y)
{
#pragma omp parallel for
  for (i = x - 10; i <= y + 10; i += N)
    baz (i);
}

part of the libgomp.c++/for-2.C testcase we now ICE, because the increment expression contains IMPLICIT_CONV_EXPR.  Fixed by also using cp_parser_omp_for_incr when processing_template_decl and, while decl is NULL, real_decl is non-NULL (if decl is non-NULL, real_decl is equal to that).

Bootstrapped/regtested on x86_64-linux and i686-linux, does this look ok to you?

2011-12-12  Jakub Jelinek  <ja...@redhat.com>

        PR c++/51496
        * parser.c (cp_parser_omp_for_loop): When determining whether
        to use cp_parser_omp_for_incr or cp_parser_expression and when
        calling cp_parser_omp_for_incr, use real_decl instead of decl.

--- gcc/cp/parser.c.jj 2011-12-11 22:02:36.0 +0100
+++ gcc/cp/parser.c 2011-12-12 13:11:27.338530238 +0100
@@ -26304,11 +26304,11 @@ cp_parser_omp_for_loop (cp_parser *parse
             {
               /* If decl is an iterator, preserve the operator on decl
                  until finish_omp_for.  */
-              if (decl
+              if (real_decl
                   && ((processing_template_decl
-                       && !POINTER_TYPE_P (TREE_TYPE (decl)))
-                      || CLASS_TYPE_P (TREE_TYPE (decl))))
-                incr = cp_parser_omp_for_incr (parser, decl);
+                       && !POINTER_TYPE_P (TREE_TYPE (real_decl)))
+                      || CLASS_TYPE_P (TREE_TYPE (real_decl))))
+                incr = cp_parser_omp_for_incr (parser, real_decl);
               else
                 incr = cp_parser_expression (parser, false, NULL);
             }

        Jakub
Re: [PATCH] Sink clobbers if EH block contains just clobbers (PR tree-optimization/51117)
On 12/12/2011 09:03 AM, Michael Matz wrote: Hmpf, I would have simply committed without a re-approval, but if you think it's necessary I'll wait. The revised patch is ok too. r~
Re: [patch] PR51347 alias problem
On 12/12/2011 11:19 AM, Aldy Hernandez wrote: Yes the testcase attached in the PR works for me but I can't change the status because I am not the reporter (nor admin). I will close it. Ok thanks. However, the testcase I have added g++.dg/tm/ctor-used.C fails. I can fill another PR but I found this problem thanks to the PR testcase. If you mean the following test, there is no ICE here either with current sources. However, I do see that you expect something else to be generated. If I compile it with optimization (-O1), there is no call to the runtime as expected (no _ITM_getTMCloneOrIrrevocable), we just inline the initialization to 0 inside the transaction. And we optimize away the constructor C():l(0). If I compile with no optimization, there is a the call through the runtime (which according to your test you DONT expect, why?), and we generate code for C():l(0). This seems correct. This is not correct. First, _ITM_getTMCloneOrIrrevocable should never appear in a __transaction_atomic (_ITM_getTMClone is ok). But the problem here is that it fails to detect the clone because of the alias. This is why we end up with a call to _ITM_getTMCloneOrIrrevocable. I don't see anything wrong with the generated code. What are you expecting? I expect a direct call to the clone constructor without asking the runtime (as you see with my patch). Patrick.
Re: [v3] doxygen warnings
This patch just removes/restructures some of the doxygen markup to avoid warnings when generating the documentation. Most of the libstdc++ headers are pretty doxygen clean now. By the way, I recently made this feature request for Doxygen: https://bugzilla.gnome.org/show_bug.cgi?id=665506 That would allow us to refer to pos not __pos in the doxygen comments, and for the generated docs to be much nicer to read, without uglified names. Awesome. That would be useful. At this point, most of the libstdc++ headers have pretty much warning-free docs, at least with current doxygen binaries. The macro trickery in PB_DS kind of makes doxygen go crazy. And some of the new C++11 features like variadic and mutable/default/deleted etc. result in weird/humorous messages. That said, the latest doxygen (1.7.6?) fails miserably, on cxxabi.h no less. And the PDF_HYPERLINKS issue is still present. Some of the time I think I can at least pinpoint the file with the markup that is making this crazy/bad by editing docs/doxygen/user.cfg.in to only include all the files in bits or all the files in ext, etc. Alas, I've not been able to get anything reproducible. But this argument thing would really be nice to have. I went ahead and uglified the markup in algorithm since that was triggering so many warnings elsewhere. -benjamin
Re: [patch] PR51347 alias problem
This is not correct. First, _ITM_getTMCloneOrIrrevocable should never appear in a __transaction_atomic (_ITM_getTMClone is ok). But the problem here is that it fails to detect the clone because of the alias. This is why we end up with a call to _ITM_getTMCloneOrIrrevocable. Ah, I see. Please open a new PR for this. This is something completely different from the aforementioned PR. CC me or assign it to me, I will take a look. Thanks.
Re: RFC: ARM 64-bit shifts in NEON
On 12/12/11 16:28, Andrew Stubbs wrote: On 07/12/11 13:42, Richard Earnshaw wrote: So it looks like the code generated for core registers with thumb2 is pretty rubbish (no real surprise there -- to get the best code you need to make use of the fact that on ARM a shift by a small negative number ( -128) will give zero. This gives us sequences like: For ARM state it's something like (untested) @ shft 32 , shft= 32 __ashldi3_v3: sub r3, r2, #32 @ -ve , shft - 32 lsl ah, ah, r2 @ ah shft, 0 rsb ip, r2, #32 @ 32 - shft , -ve orr ah, ah, al, lsl r3 @ ah shft, al shft - 32 orr ah, ah, al, lsr ip @ ah shft | al 32 - shft , al shft - 32 lsl al, al, r2 @ al shft, 0 For Thumb2 (where there is no orr with register shift) lslsah, ah, r2 @ ah shft, 0 sub r3, r2, #32 @ -ve , shft - 32 lsl ip, al, r3 @ 0 , al shft - 32 negsr3, r3 @ 32 - shft , -ve orr ah, ah, ip @ ah shft, al shft - 32 lsr r3, al, r3 @ al 32 - shft , 0 orrsah, ah, r3 @ ah shft | al 32 - shft , al shft - 32 lslsal, al, r2 @ al shft, 0 Neither of which needs the condition flags during execution (and indeed is probably better in both cases than the code currently in lib1funcs.asm for a modern core). The flag clobbering behaviour in the thumb2 variant is only for code size saving; that would normally be added by a late optimization pass. None of this directly helps with your neon usage, but it does show that we really don't need to clobber the condition code register to get an efficient sequence. Unfortunately, both these sequences use two scratch registers, as shown, and that's worse than clobbering CC. Now, I can implement this for non-Neon easily enough, I think, and that would be a win, but I'm trying to figure out how best to do it for both that case and the case where neon is available but the compiler chooses not to do it. 
The problem is that when there is no neon available, this can be converted at expand or split1 time, but when neon *is* available we have to wait until a post-reload split, and then we'd be forced to expand this in early-clobber mode, which is far less optimal. Any suggestions now to do this without pessimizing the code in the case that neon is available but not used? In fact, is the general shift operation sufficiently expensive that I should I just abandon the fall back alternatives and *always* use Neon when available? In this case, what about A8 vs. A9? Thanks Andrew Can't you write the pattern with the scratch registers, but use X as the constraint when neon (so that no register gets allocated)? It's easier to do that for real registers than the condition codes register because they really can just go away if they aren't needed. R.
[PATCH] Don't ICE trying to redirect abnormal edges during shrink-wrapping (PR rtl-optimization/51495)
Hi!

On this testcase we ICE starting from
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181188
because one of the basic blocks we add to bb_tail has EDGE_ABNORMAL edge (computed goto) from a basic block that doesn't need prologue. We try to duplicate the basic block and finally redirect that EDGE_ABNORMAL edge, which ICEs.

Fixed by not putting basic blocks into bb_tail if they have complex edges from basic blocks that don't need prologue. In x86_64/i686 bootstraps/regtests this only affected compile/pr51495.c, compile/pr28489.c, torture/pr42462.C and execute/980526-1.c testcases.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-12-12  Jakub Jelinek  ja...@redhat.com

        PR rtl-optimization/51495
        * function.c (thread_prologue_and_epilogue_insns): Don't add
        to bb_tail basic blocks that have EDGE_COMPLEX predecessor edges
        from basic blocks not needing prologue.

        * gcc.c-torture/compile/pr51495.c: New test.

--- gcc/function.c.jj 2011-12-11 22:02:37.0 +0100
+++ gcc/function.c 2011-12-12 14:31:01.821624702 +0100
@@ -5956,9 +5956,22 @@ thread_prologue_and_epilogue_insns (void
               FOR_EACH_EDGE (e, ei, tmp_bb->preds)
                 if (single_succ_p (e->src)
                     && !bitmap_bit_p (bb_on_list, e->src->index)
-                    && can_duplicate_block_p (e->src)
-                    && bitmap_set_bit (bb_tail, e->src->index))
-                  VEC_quick_push (basic_block, vec, e->src);
+                    && can_duplicate_block_p (e->src))
+                  {
+                    edge pe;
+                    edge_iterator pei;
+
+                    /* If there is predecessor of e->src which doesn't
+                       need prologue and the edge is complex,
+                       we might not be able to redirect the branch
+                       to a copy of e->src.  */
+                    FOR_EACH_EDGE (pe, pei, e->src->preds)
+                      if ((pe->flags & EDGE_COMPLEX) != 0
+                          && !bitmap_bit_p (bb_flags, pe->src->index))
+                        break;
+                    if (pe == NULL && bitmap_set_bit (bb_tail, e->src->index))
+                      VEC_quick_push (basic_block, vec, e->src);
+                  }
             }
 
           /* Now walk backwards from every block that is marked as needing
--- gcc/testsuite/gcc.c-torture/compile/pr51495.c.jj 2011-12-12 14:32:45.047017210 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr51495.c 2011-12-12 14:32:15.0 +0100
@@ -0,0 +1,14 @@
+/* PR rtl-optimization/51495 */
+
+void bar (void);
+
+int
+foo (int i)
+{
+  static const void *const table[] = { &&begin, &&end };
+  goto *(table[i]);
+begin:
+  bar ();
+end:
+  return 0;
+}

        Jakub
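The predicate the patch adds can be pictured in isolation with a small standalone model. This is only a sketch under stated assumptions: struct pred_edge, MODEL_EDGE_COMPLEX and may_add_to_tail are hypothetical stand-ins, not GCC's edge, EDGE_COMPLEX or bitmap APIs, and the "needs prologue" bit is stored directly on the edge instead of being looked up in bb_flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's EDGE_COMPLEX mask.  */
#define MODEL_EDGE_COMPLEX 0x1u

/* Hypothetical, simplified view of one predecessor edge.  */
struct pred_edge {
    unsigned flags;            /* edge flags */
    bool src_needs_prologue;   /* models bitmap_bit_p (bb_flags, pe->src->index) */
};

/* Return true if a block with these predecessor edges may go into the
   tail set: no predecessor edge is both complex and coming from a
   block that does not need the prologue.  */
static bool may_add_to_tail(const struct pred_edge *preds, size_t n)
{
    size_t i;

    assert(preds != NULL || n == 0);

    for (i = 0; i < n; i++)
        if ((preds[i].flags & MODEL_EDGE_COMPLEX) != 0
            && !preds[i].src_needs_prologue)
            return false;   /* corresponds to the "break" in the patch */

    return true;            /* corresponds to "pe == NULL" after the loop */
}
```

In the model, a block reached by a computed goto from a block that does not need the prologue is rejected, which is the situation in pr51495.c; a complex edge from a block that does need the prologue is still accepted.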
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On 12/12/2011 04:28 PM, Paolo Carlini wrote:

Hi,

Ok. By looking at this, it might be better to use here a define - as you mentioned. As I would need to copy here namespace too.

Ok, thanks. Let's make sure nothing can possibly change for != mingw, we don't want to take risks at this time.

For the time being I reverted the whole thing, the unwind-cxx.h bits in particular made me very nervous, because we have C++ linkage in that case. Please make sure to handle it correctly in the next try.

Paolo.
Re: [patch PR libstdc++/51135]: Fix [4.7 Regression] SIGSEGV during exception cleanup on win32
On Mon, 12 Dec 2011, Paolo Carlini wrote:
On 12/12/2011 04:28 PM, Paolo Carlini wrote:

Hi,

Ok. By looking at this, it might be better to use here a define - as you mentioned. As I would need to copy here namespace too.

Ok, thanks. Let's make sure nothing can possibly change for != mingw, we don't want to take risks at this time.

For the time being I reverted the whole thing, the unwind-cxx.h bits in particular made me very nervous, because we have C++ linkage in that case. Please make sure to handle it correctly in the next try.

Actually, g++ currently simply ignores the linkage as part of function types, so this shouldn't have any effect. But it won't hurt to keep it inside extern "C" ;-)

--
Marc Glisse
Re: [C++ PATCH] Fix for-2.C OpenMP regression (PR c++/51496)
OK. Jason