Re: [debug-early] Handle specification of class scoped static functions
On 03/20/2015 02:21 PM, Jason Merrill wrote: I think we want to drop the debug_early check there entirely; the added conditions seem to be gutting it. If is_cu_die (old_die->die_parent) is false, then class_or_namespace_scope_p (old_die->die_parent) ought to be true. Jason Good catch. I am so glad you are keeping track of all this spaghetti, but in my defense, it was pasta already. With the attached I also got rid of one superfluous check for `old_die', as well as your suggestion. We get rid of one more gdb regression. Yay. I'll wait for your high-five (or OK) before committing to branch. Thanks. Aldy commit b3d910713d27dc29801f3ddbe8671a4a6e0de4c1 Author: Aldy Hernandez Date: Fri Mar 20 09:55:31 2015 -0700 Handle specification of class scoped static functions. Remove superfluous check for old_die. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 8884afd..7a52dc8 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -18735,7 +18735,7 @@ gen_subprogram_die (tree decl, dw_die_ref context_die) much as possible. */ else if (old_die) { - dumped_early = old_die && old_die->dumped_early; + dumped_early = old_die->dumped_early; /* A declaration that has been previously dumped needs no additional information. */ @@ -18768,13 +18768,23 @@ gen_subprogram_die (tree decl, dw_die_ref context_die) apply; we just use the old DIE. */ expanded_location s = expand_location (DECL_SOURCE_LOCATION (decl)); struct dwarf_file_data * file_index = lookup_filename (s.file); - if (((is_cu_die (old_die->die_parent) - || context_die == NULL - || dumped_early) + if ((is_cu_die (old_die->die_parent) + || context_die == NULL + /* For class scoped static functions, the dumped early + version was the declaration, whereas the next time + around with a different context should be the + specification. In this case, avoid reusing the DIE, but + generate a specification below. E.g.: + + class C { + public: + static void moo () {} + }; */ + || !is_cu_die (context_die)) && (DECL_ARTIFICIAL (decl) || (get_AT_file (old_die, DW_AT_decl_file) == file_index && (get_AT_unsigned (old_die, DW_AT_decl_line) - == (unsigned) s.line) + == (unsigned) s.line { subr_die = old_die;
Re: [PATCH, rs6000, testsuite] Fix PR target/64579, __TM_end __builtin_tend failed to return transactional state
On Fri, 2015-03-20 at 15:52 -0500, Segher Boessenkool wrote: > Maybe it would be nicer if the builtin-expansion code handled the copy > from cc, instead of stacking on RTL expanders. That would allow getting rid of the expanders completely, which would be nice. I'd have to somehow add some type of RS6000_BTC_* flag onto the builtin though, so I can tell during builtin expansion whether I need to emit the cr copy code or not. I'll give that a try ...when I return from sunny Mexico in a week. :-) Thanks for the suggestion. > > (define_expand "tabortdc" [snip] > > + (match_operand 1 "u5bit_cint_operand" "n") [snip] > > Expanders have no constraints (you can leave out the field completely). > Doesn't gen* warn on non-empty constraints? Correct, and David mentioned this when I first submitted the original HTM patch, but I replied they were added to allow better error messages when people used out of range integers for builtin args: https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00565.html > > --- gcc/testsuite/gcc.target/powerpc/htm-1.c(revision 0) > > +++ gcc/testsuite/gcc.target/powerpc/htm-1.c(working copy) > > @@ -0,0 +1,53 @@ > > +/* { dg-do run { target { powerpc*-*-* && htm_hw } } } */ > > +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ > > htm_hw already disallows Darwin? [ And {"*"} {""} is default. ] > > > + # For now, disable on Darwin > > + if { [istarget powerpc-*-eabi] || [istarget powerpc*-*-eabispe] || > > [istarget *-*-darwin*]} { > > + expr 0 > > If ever HTM is enabled on Darwin, the testcases should Just Work as far > as I see. Heh, just cut/pasted the dg-do-skip from another test case, but yes, we can remove it. Thanks. It's too bad we can't have a "dg-do run" and a "dg-do compile", one used when we're on HTM hardware and the other when we're not, so we can at least compile the test case. Peter
[Patch, Fortran, 4.8/4.9/5 Regression] PR59513 READ or WRITE not allowed after EOF
The attached patch allows the attempt to READ or WRITE after an EOF for legacy code. The runtime error is suppressed for -std=legacy and -std=gnu. For standard conformance the error is retained as is now. Regression tested on x86-64. Test case endfile_3.f90 is updated by the patch. OK for trunk and then 4.9, 4.8? Regards, Jerry 2015-03-20 Jerry DeLisle PR libgfortran/59513 * io/transfer.c (data_transfer_init): Only error if -std=f95 or higher. Index: libgfortran/io/transfer.c === --- libgfortran/io/transfer.c (revision 221543) +++ libgfortran/io/transfer.c (working copy) @@ -2533,7 +2533,8 @@ data_transfer_init (st_parameter_dt *dtp, int read return; } - if (dtp->u.p.current_unit->endfile == AFTER_ENDFILE) + if (!(compile_options.allow_std & GFC_STD_GNU) && + dtp->u.p.current_unit->endfile == AFTER_ENDFILE) { generate_error (&dtp->common, LIBERROR_OPTION_CONFLICT, "Sequential READ or WRITE not allowed after " Index: gcc/testsuite/gfortran.dg/endfile_3.f90 === --- gcc/testsuite/gfortran.dg/endfile_3.f90 (revision 221473) +++ gcc/testsuite/gfortran.dg/endfile_3.f90 (working copy) @@ -1,9 +1,10 @@ ! { dg-do run { target fd_truncate } } +! { dg-options -std=f95 } ! pr44477 READ/WRITE not allowed after ENDFILE !--- open(10, form='formatted', & action='write', position='rewind', status="scratch") endfile(10) - write(10,'(a)') "aa" ! { dg-shouldfail "Cannot perform ENDFILE" } + write(10,'(a)')"aa"! { dg-shouldfail "not allowed after EOF marker" } end
Re: GCC 5 Status Report (2015-03-20)
> Please also prioritize fixing P1s and avoid pushing in risky > fixes for P2s that might end up causing regressions. We are still > seeing way too many changes in the IPA area (hi Honza!). Hello :) GCC 5 is a busy release from IPA point of view indeed. Here is a quick summary what is still on my radar for GCC-5: Bug 62051 - [4.9/5 Regression] Undefined reference to vtable with -O2 and -fdevirtualize-speculatively I think this one can be waved away as invalid code though tracking types with visibility default attached to it may be a safe fix and may make transition to GCC 5 smoother. I do not think it should be P1 Bug 65475 - [5 Regression] ICE in odr_vtable_hasher::equal (segmentation fault) Segfault is fixed (so no longer P1 IMO) I want to tamn down the verbosity of ODR violation warnings at weekend (we probably do not want more than one warning per type) Bug 64860 - [5 Regression] multiple definition of typeinfo in 5.0 (4.9.2 works)) This is about incremental linking with LTO but without Andi's patches to make it actually work. I think we can fix the linker errors eaisly (will do so soon), but I do not see much utility in it. Bug 61886 - [4.8/4.9/5 Regression] LTO breaks fread with _FORTIFY_SOURCE=2 This one I think may need to wait for 5.2. I have fix in works and it is not too intrusiove so far. My main concern is about getting aliases more commonly appearing and exposing semi-latent bugs. We did a lot of progress making them work thanks to ICF making aliases mainstream. Main issue IMO is with alias analysis (that is not hit by ICF), but to fix 61886 we may want to introduce the transparent aliases only for function symbols that is IMO quite safe. Once I am done I will post full patch for RMs to consider. We still have some inliner performance issues: - I analyzed bzip, crafty and regressions https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65483 The first one is easy to fix at ipa-cp side that I hope Martin will look into - Igor reports eon regression I plan to analyze on weekend - tramp3d is about 10% above GCC 4.9 codesize - I think this is fine. trakced in: Bug 65076 - [5 Regression] 16% tramp3d-v4.cpp compile time regression There are two parameter tweeks I am considering (and running benchamrks over weekend): - We may run into issues with current 15% unit growth is bit too tight. I am planning to re-benchmark firefox and see if bumping to 20% makes useful difference. With all bugfixes in area it may not - -fprofile-use inlining has currently no bound on size of function to inline when call is considered resonably hot. This makes me bit nervous and I may limit it by inline-insns-single without LTO. (with LTO overall unit growth works well) Chromium and Firefox has some ODR violation warnings that may need a bit of love. I think there are some that are just invalid (need to double check) and we get some unwanted verbosity here too. ICF at Chromium needs 15% of WPA time (and saves 10% size in addition to gold ICF). Martin has patches to improve hashing, one was approved the other (adding simple hash for types) will be hopefully done by Monday. Honza
Re: [debug-early] Handle specification of class scoped static functions
I think we want to drop the debug_early check there entirely; the added conditions seem to be gutting it. If is_cu_die (old_die->die_parent) is false, then class_or_namespace_scope_p (old_die->die_parent) ought to be true. Jason
Re: C++ PATCH for c++/65398 (valid constexpr rejected) (take 2)
On 03/20/2015 05:03 PM, Jakub Jelinek wrote: Hi! Just for completeness: On Fri, Mar 20, 2015 at 09:56:52PM +0100, Marek Polacek wrote: +constexpr char d1 = *(&s[4] - 0); // { dg-error "array subscript out of bound" } +constexpr char d2 = *(&s[4] - 1); +constexpr char d3 = *(&s[4] - 2); +constexpr char d4 = *(&s[4] - 3); +constexpr char d5 = *(&s[4] - 4); +constexpr char d6 = *(&s[4] - 5); // { dg-error "negative array subscript" } + +/* Don't accept invalid stuff. */ +constexpr char e1 = *(&s[5] - 1); // { dg-error "is not a constant expression" } +constexpr char e2 = *(&s[5] - 2); // { dg-error "is not a constant expression" } +constexpr char e3 = *(&s[5] - 3); // { dg-error "is not a constant expression" } + +SA (c1 == 'a'); +SA (c2 == 'b'); +SA (c3 == 'b'); +SA (c4 == 'c'); +SA (c5 == 'c'); +SA (c6 == 'c'); +SA (c7 == '\0'); Miss SA here for d2-d5. +constexpr char j1 = *(&l[4] - 0); // { dg-error "array subscript out of bound" } +constexpr char j2 = *(&l[4] - 1); +constexpr char j3 = *(&l[4] - 2); +constexpr char j4 = *(&l[4] - 3); +constexpr char j5 = *(&l[4] - 4); +constexpr char j6 = *(&l[4] - 5); // { dg-error "negative array subscript" } +SA (i1 == 'c'); +SA (i2 == 'd'); +SA (i3 == 'd'); +SA (i4 == 'e'); +SA (i5 == 'e'); +SA (i6 == 'e'); +SA (i7 == '\0'); And SA here for j2-j5. OK with these changes. Jason
Re: [PATCH] Fix __has_{cpp_}attribute with -traditional-cpp (PR preprocessor/65238)
On 03/20/2015 01:23 PM, Jakub Jelinek wrote: Well, but the traditional preprocessor behavior in the end is not in not expanding macro arguments, normally they actually are expanded, but not immediately, but because of trying to preprocess the result of preprocessing again and again. When traditionally preprocessing #if/#elif etc. directives, the directive line is preprocessed until no macros are present and only then fed to the ISO preprocessor to evaluate and handle the directive. The problem with the builtin function-like macros is that it needs the argument immediately. Ah, I see, thanks. Please add some of this rationale to the patch. OK with that. Jason
Re: C++ PATCH for c++/65398 (valid constexpr rejected) (take 2)
Hi! Just for completeness: On Fri, Mar 20, 2015 at 09:56:52PM +0100, Marek Polacek wrote: > +constexpr char d1 = *(&s[4] - 0); // { dg-error "array subscript out of > bound" } > +constexpr char d2 = *(&s[4] - 1); > +constexpr char d3 = *(&s[4] - 2); > +constexpr char d4 = *(&s[4] - 3); > +constexpr char d5 = *(&s[4] - 4); > +constexpr char d6 = *(&s[4] - 5); // { dg-error "negative array subscript" } > + > +/* Don't accept invalid stuff. */ > +constexpr char e1 = *(&s[5] - 1); // { dg-error "is not a constant > expression" } > +constexpr char e2 = *(&s[5] - 2); // { dg-error "is not a constant > expression" } > +constexpr char e3 = *(&s[5] - 3); // { dg-error "is not a constant > expression" } > + > +SA (c1 == 'a'); > +SA (c2 == 'b'); > +SA (c3 == 'b'); > +SA (c4 == 'c'); > +SA (c5 == 'c'); > +SA (c6 == 'c'); > +SA (c7 == '\0'); Miss SA here for d2-d5. > +constexpr char j1 = *(&l[4] - 0); // { dg-error "array subscript out of > bound" } > +constexpr char j2 = *(&l[4] - 1); > +constexpr char j3 = *(&l[4] - 2); > +constexpr char j4 = *(&l[4] - 3); > +constexpr char j5 = *(&l[4] - 4); > +constexpr char j6 = *(&l[4] - 5); // { dg-error "negative array subscript" } > +SA (i1 == 'c'); > +SA (i2 == 'd'); > +SA (i3 == 'd'); > +SA (i4 == 'e'); > +SA (i5 == 'e'); > +SA (i6 == 'e'); > +SA (i7 == '\0'); And SA here for j2-j5. Jakub
Re: [PATCH] Make split_block and create_basic_block type-safe
On Thu, 2015-03-12 at 14:20 +0100, Richard Biener wrote: > After noticing tree-parloop.c passing crap to split_block (a tree > rather than a gimple or an rtx) I noticed those CFG functions simply > take void * pointers. The following patch fixes that and adds > two overloads, one for GIMPLE use and one for RTL use. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > Ok at this stage? > > Thanks, > Richard. > > 2015-03-12 Richard Biener > > * cfghooks.h (create_basic_block): Replace with two overloads > for RTL and GIMPLE. > (split_block): Likewise. > * cfghooks.c (split_block): Rename to ... > (split_block_1): ... this. > (split_block): Add two type-safe overloads for RTL and GIMPLE. > (split_block_after_labels): Call split_block_1. > (create_basic_block): Rename to ... > (create_basic_block_1): ... this. > (create_basic_block): Add two type-safe overloads for RTL and GIMPLE. > (create_empty_bb): Call create_basic_block_1. > * cfgrtl.c (fixup_fallthru_exit_predecessor): Use > split_block_after_labels. > * omp-low.c (expand_parallel_call): Likewise. > (expand_omp_target): Likewise. > (simd_clone_adjust): Likewise. > * tree-chkp.c (chkp_get_entry_block): Likewise. > * cgraphunit.c (init_lowered_empty_function): Use the GIMPLE > create_basic_block overload. > (cgraph_node::expand_thunk): Likewise. > * tree-cfg.c (make_blocks): Likewise. > (handle_abnormal_edges): Likewise. > * tree-inline.c (copy_bb): Likewise. > > Index: gcc/cfghooks.c > === > --- gcc/cfghooks.c(revision 221379) > +++ gcc/cfghooks.c(working copy) [...snip...] > +edge > +split_block (basic_block bb, rtx i) > +{ > + return split_block_1 (bb, i); > +} Possibly a dumb question, but could this take an rtx_insn * rather than a plain rtx? > +basic_block > +create_basic_block (rtx head, rtx end, basic_block after) > +{ > + return create_basic_block_1 (head, end, after); > +} Likewise for head and end... though I see a fix would be needed in bfin.c:hwloop_optimize, at least. > Index: gcc/cfghooks.h > === > --- gcc/cfghooks.h(revision 221379) > +++ gcc/cfghooks.h(working copy) > -extern edge split_block (basic_block, void *); > +extern edge split_block (basic_block, rtx); (etc) > -extern basic_block create_basic_block (void *, void *, basic_block); > +extern basic_block create_basic_block (rtx, rtx, basic_block); (etc)
Re: C++ PATCH for c++/65398 (valid constexpr rejected) (take 2)
On Fri, Mar 20, 2015 at 10:53:50AM -0400, Jason Merrill wrote: > On 03/19/2015 02:05 PM, Jakub Jelinek wrote: > >Though, a question is if we do (or, if we don't and should) reject say > >constexpr char s[] = "abc"; > >constexpr int j = 4; > >constexpr char c = *(&s[j] - 2); > >because there was out of bound access in there. > > I don't see an out-of-bound access in this example; taking the address of > one-past-the-end is OK as long as you don't try to access through it. > > >Unfortunately we reject even that (regardless the patch), and yeah, it's > >because of how POINTER_PLUS_EXPR uses sizetype as a type of the second > >operand. > > This seems like something to fix in this patch. In the following version of the patch I tried to address various diagnostic issues raised in this thread. Please see the testcase if what I did makes sense. > >+ tree t = fold_convert_loc (loc, TREE_TYPE (op01), > >+ TREE_OPERAND (op00, 1)); > >+ t = size_binop_loc (loc, PLUS_EXPR, op01, t); > > This seems to be assuming that the elements are size 1. Dunno how I flubbed that. Fixed. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2015-03-20 Marek Polacek PR c++/65398 * constexpr.c (cxx_fold_indirect_ref): Transform *(&A[i] p+ j) into A[i + j]. * g++.dg/cpp0x/pr65398.C: New test. diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c index 1b5f50c..37b619d 100644 --- gcc/cp/constexpr.c +++ gcc/cp/constexpr.c @@ -2427,6 +2427,27 @@ cxx_fold_indirect_ref (location_t loc, tree type, tree op0, bool *empty_base) break; } } + /* *(&A[i] p+ j) => A[i + j] */ + else if (TREE_CODE (op00) == ARRAY_REF + && TREE_CODE (TREE_OPERAND (op00, 1)) == INTEGER_CST + && TREE_CODE (op01) == INTEGER_CST) + { + tree t = fold_convert_loc (loc, ssizetype, +TREE_OPERAND (op00, 1)); + tree nelts + = array_type_nelts_top (TREE_TYPE (TREE_OPERAND (op00, 0))); + /* Don't fold an out-of-bound access. */ + if (!tree_int_cst_le (t, nelts)) + return NULL_TREE; + /* Make sure to treat the second operand of POINTER_PLUS_EXPR +as signed. */ + op01 = fold_build2_loc (loc, EXACT_DIV_EXPR, ssizetype, + cp_fold_convert (ssizetype, op01), + TYPE_SIZE_UNIT (type)); + t = size_binop_loc (loc, PLUS_EXPR, op01, t); + return build4_loc (loc, ARRAY_REF, type, TREE_OPERAND (op00, 0), +t, NULL_TREE, NULL_TREE); + } } } /* *(foo *)fooarrptr => (*fooarrptr)[0] */ diff --git gcc/testsuite/g++.dg/cpp0x/pr65398.C gcc/testsuite/g++.dg/cpp0x/pr65398.C index e69de29..4aa7455 100644 --- gcc/testsuite/g++.dg/cpp0x/pr65398.C +++ gcc/testsuite/g++.dg/cpp0x/pr65398.C @@ -0,0 +1,62 @@ +// PR c++/65398 +// { dg-do compile { target c++11 } } + +#define SA(X) static_assert((X),#X) + +constexpr char s[] = "abc"; +constexpr char c1 = *(&s[0] + 0); +constexpr char c2 = *(&s[0] + 1); +constexpr char c3 = *(&s[1] + 0); +constexpr char c4 = *(&s[1] + 1); +constexpr char c5 = *(&s[2] + 0); +constexpr char c6 = *(&s[0] + 2); +constexpr char c7 = *(&s[2] + 1); + +constexpr char d1 = *(&s[4] - 0); // { dg-error "array subscript out of bound" } +constexpr char d2 = *(&s[4] - 1); +constexpr char d3 = *(&s[4] - 2); +constexpr char d4 = *(&s[4] - 3); +constexpr char d5 = *(&s[4] - 4); +constexpr char d6 = *(&s[4] - 5); // { dg-error "negative array subscript" } + +/* Don't accept invalid stuff. */ +constexpr char e1 = *(&s[5] - 1); // { dg-error "is not a constant expression" } +constexpr char e2 = *(&s[5] - 2); // { dg-error "is not a constant expression" } +constexpr char e3 = *(&s[5] - 3); // { dg-error "is not a constant expression" } + +SA (c1 == 'a'); +SA (c2 == 'b'); +SA (c3 == 'b'); +SA (c4 == 'c'); +SA (c5 == 'c'); +SA (c6 == 'c'); +SA (c7 == '\0'); + +constexpr int l[] = { 'c', 'd', 'e', '\0' }; +constexpr int i1 = *(&l[0] + 0); +constexpr int i2 = *(&l[0] + 1); +constexpr int i3 = *(&l[1] + 0); +constexpr int i4 = *(&l[1] + 1); +constexpr int i5 = *(&l[2] + 0); +constexpr int i6 = *(&l[0] + 2); +constexpr int i7 = *(&l[2] + 1); + +constexpr char j1 = *(&l[4] - 0); // { dg-error "array subscript out of bound" } +constexpr char j2 = *(&l[4] - 1); +constexpr char j3 = *(&l[4] - 2); +constexpr char j4 = *(&l[4] - 3); +constexpr char j5 = *(&l[4] - 4); +constexpr char j6 = *(&l[4] - 5); // { dg-error "negative array subscript" } + +/* Don't accept invalid stuff. */ +constexpr char k1 = *(&l[5] - 1); // { dg-error "is not a constant expression" } +constexpr char k2 = *(&l[5] - 2); // { dg-error "is not a constant expression" } +constexpr char k3 = *(&l[5] - 3); // { dg-erro
Re: [PATCH, rs6000, testsuite] Fix PR target/64579, __TM_end __builtin_tend failed to return transactional state
Hi Peter, Nice patch. Some minor things... On Fri, Mar 20, 2015 at 02:27:40PM -0500, Peter Bergner wrote: > Index: gcc/config/rs6000/htm.md > === > --- gcc/config/rs6000/htm.md (revision 221519) > +++ gcc/config/rs6000/htm.md (working copy) > @@ -48,24 +48,19 @@ (define_c_enum "unspecv" > > > (define_expand "tabort" > - [(set (match_dup 2) > - (unspec_volatile:CC [(match_operand:SI 1 "int_reg_operand" "")] > - UNSPECV_HTM_TABORT)) > - (set (match_dup 3) > - (eq:SI (match_dup 2) > -(const_int 0))) > - (set (match_operand:SI 0 "int_reg_operand" "") > - (xor:SI (match_dup 3) > - (const_int 1)))] > + [(match_operand:SI 0 "gpc_reg_operand" "") > + (match_operand:SI 1 "gpc_reg_operand" "")] >"TARGET_HTM" > { > - operands[2] = gen_rtx_REG (CCmode, CR0_REGNO); > - operands[3] = gen_reg_rtx (SImode); > + rtx cr = gen_reg_rtx (CCmode); > + emit_insn (gen_tabort_internal (operands[1], cr)); > + rs6000_emit_move_from_cr_field (operands[0], cr); > + DONE; > }) Maybe it would be nicer if the builtin-expansion code handled the copy from cc, instead of stacking on RTL expanders. > (define_expand "tabortdc" > - [(set (match_dup 4) > - (unspec_volatile:CC [(match_operand 1 "u5bit_cint_operand" "n") > - (match_operand:SI 2 "gpc_reg_operand" "r") > - (match_operand:SI 3 "gpc_reg_operand" "r")] > - UNSPECV_HTM_TABORTDC)) > - (set (match_dup 5) > - (eq:SI (match_dup 4) > -(const_int 0))) > - (set (match_operand:SI 0 "int_reg_operand" "") > - (xor:SI (match_dup 5) > - (const_int 1)))] > + [(match_operand:SI 0 "gpc_reg_operand" "") > + (match_operand 1 "u5bit_cint_operand" "n") > + (match_operand:SI 2 "gpc_reg_operand" "") > + (match_operand:SI 3 "gpc_reg_operand" "")] >"TARGET_HTM" > { > - operands[4] = gen_rtx_REG (CCmode, CR0_REGNO); > - operands[5] = gen_reg_rtx (SImode); > + rtx cr = gen_reg_rtx (CCmode); > + emit_insn (gen_tabortdc_internal (operands[1], operands[2], operands[3], > cr)); > + rs6000_emit_move_from_cr_field (operands[0], cr); > + DONE; > }) Expanders have no constraints (you can leave out the field completely). Doesn't gen* warn on non-empty constraints? > --- gcc/testsuite/gcc.target/powerpc/htm-1.c (revision 0) > +++ gcc/testsuite/gcc.target/powerpc/htm-1.c (working copy) > @@ -0,0 +1,53 @@ > +/* { dg-do run { target { powerpc*-*-* && htm_hw } } } */ > +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ htm_hw already disallows Darwin? [ And {"*"} {""} is default. ] > + # For now, disable on Darwin > + if { [istarget powerpc-*-eabi] || [istarget powerpc*-*-eabispe] || > [istarget *-*-darwin*]} { > + expr 0 If ever HTM is enabled on Darwin, the testcases should Just Work as far as I see. Cheers, Segher
Re: [GOOGLE] Fixes for TLS wrapper and init functions in LIPO mode
ok thanks. David On Fri, Mar 20, 2015 at 1:40 PM, Teresa Johnson wrote: > After offline discussions with David, I changed to the approach of > generating __tls_init for the aux module, but then modifying the LIPO > code to allow it to be statically promoted and treated as an aux > function as a special case of an artificial decl. That works well. New > patch below. Fixed the tests to validate rather than print the > expected output. > > Ok for google branches? > > 2015-03-20 Teresa Johnson > > gcc/ > Google ref b/19618364. > * cp/decl2.c (get_local_tls_init_fn): Mark static, > record new global at module scope. > (get_tls_init_fn): Record new global at module scope. > (get_tls_wrapper_fn): Ditto. > (handle_tls_init): Suppress alias in aux module. > * l-ipo.c (decl_artificial_nopromote): New function. > (cgraph_is_aux_decl_external): Call new function. > (process_module_scope_static_func): Ditto. > > testsuite/ > Google ref b/19618364. > * testsuite/g++.dg/tree-prof/lipo/tls.h: New test. > * testsuite/g++.dg/tree-prof/lipo/tls2.h: Ditto. > * testsuite/g++.dg/tree-prof/lipo/tls2_0.C: Ditto. > * testsuite/g++.dg/tree-prof/lipo/tls2_1.C: Ditto. > * testsuite/g++.dg/tree-prof/lipo/tls_0.C: Ditto. > * testsuite/g++.dg/tree-prof/lipo/tls_1.C: Ditto. > > Index: cp/decl2.c > === > --- cp/decl2.c (revision 221425) > +++ cp/decl2.c (working copy) > @@ -3033,9 +3033,15 @@ get_local_tls_init_fn (void) > void_list_node)); >SET_DECL_LANGUAGE (fn, lang_c); >TREE_PUBLIC (fn) = false; > + TREE_STATIC (fn) = true; >DECL_ARTIFICIAL (fn) = true; >mark_used (fn); >SET_IDENTIFIER_GLOBAL_VALUE (sname, fn); > + /* In LIPO mode make sure we record the new global value so that it > + is cleared before parsing the next aux module. */ > + if (L_IPO_COMP_MODE && !is_parsing_done_p ()) > +add_decl_to_current_module_scope (fn, > + NAMESPACE_LEVEL > (global_namespace)); > } >return fn; > } > @@ -3100,6 +3106,11 @@ get_tls_init_fn (tree var) >DECL_BEFRIENDING_CLASSES (fn) = var; > >SET_IDENTIFIER_GLOBAL_VALUE (sname, fn); > + /* In LIPO mode make sure we record the new global value so that it > + is cleared before parsing the next aux module. */ > + if (L_IPO_COMP_MODE && !is_parsing_done_p ()) > +add_decl_to_current_module_scope (fn, > + NAMESPACE_LEVEL > (global_namespace)); > } >return fn; > } > @@ -3157,6 +3168,11 @@ get_tls_wrapper_fn (tree var) >DECL_BEFRIENDING_CLASSES (fn) = var; > >SET_IDENTIFIER_GLOBAL_VALUE (sname, fn); > + /* In LIPO mode make sure we record the new global value so that it > + is cleared before parsing the next aux module. */ > + if (L_IPO_COMP_MODE && !is_parsing_done_p ()) > +add_decl_to_current_module_scope (fn, > + NAMESPACE_LEVEL > (global_namespace)); > } >return fn; > } > @@ -4213,8 +4229,12 @@ handle_tls_init (void) >one_static_initialization_or_destruction (var, init, true); > > #ifdef ASM_OUTPUT_DEF > - /* Output init aliases even with -fno-extern-tls-init. */ > - if (TREE_PUBLIC (var)) > + /* Output init aliases even with -fno-extern-tls-init. Don't emit > + aliases in LIPO aux modules, since the corresponding __tls_init > + will be static promoted and deleted, so the variable's tls init > + function will be resolved by its own primary module. An alias > + would prevent the promoted aux __tls_init from being deleted. */ > + if (TREE_PUBLIC (var) && !L_IPO_IS_AUXILIARY_MODULE) > { >tree single_init_fn = get_tls_init_fn (var); > if (single_init_fn == NULL_TREE) > Index: l-ipo.c > === > --- l-ipo.c (revision 221425) > +++ l-ipo.c (working copy) > @@ -1120,6 +1120,27 @@ cgraph_unify_type_alias_sets (void) >htab_delete (type_hash_tab); > } > > +/* Return true if DECL is an artificial function that we do not want > + to promote and which may not be available in the primary module. > + The sole exception is currently __tls_init. */ > + > +static bool > +decl_artificial_nopromote (tree decl) > +{ > + if (!DECL_ARTIFICIAL (decl)) > +return false; > + > + /* Handle the __tls_init function specially as we do want to promote it and > + allow the aux module to be resolved to the version in the primary > module. > + We check if it is prefixed by __tls_init to catch it after promotion > + as well from cgraph_is_aux_decl_external. */ > + if (!strncmp (
Re: [GOOGLE] Fixes for TLS wrapper and init functions in LIPO mode
After offline discussions with David, I changed to the approach of generating __tls_init for the aux module, but then modifying the LIPO code to allow it to be statically promoted and treated as an aux function as a special case of an artificial decl. That works well. New patch below. Fixed the tests to validate rather than print the expected output. Ok for google branches? 2015-03-20 Teresa Johnson gcc/ Google ref b/19618364. * cp/decl2.c (get_local_tls_init_fn): Mark static, record new global at module scope. (get_tls_init_fn): Record new global at module scope. (get_tls_wrapper_fn): Ditto. (handle_tls_init): Suppress alias in aux module. * l-ipo.c (decl_artificial_nopromote): New function. (cgraph_is_aux_decl_external): Call new function. (process_module_scope_static_func): Ditto. testsuite/ Google ref b/19618364. * testsuite/g++.dg/tree-prof/lipo/tls.h: New test. * testsuite/g++.dg/tree-prof/lipo/tls2.h: Ditto. * testsuite/g++.dg/tree-prof/lipo/tls2_0.C: Ditto. * testsuite/g++.dg/tree-prof/lipo/tls2_1.C: Ditto. * testsuite/g++.dg/tree-prof/lipo/tls_0.C: Ditto. * testsuite/g++.dg/tree-prof/lipo/tls_1.C: Ditto. Index: cp/decl2.c === --- cp/decl2.c (revision 221425) +++ cp/decl2.c (working copy) @@ -3033,9 +3033,15 @@ get_local_tls_init_fn (void) void_list_node)); SET_DECL_LANGUAGE (fn, lang_c); TREE_PUBLIC (fn) = false; + TREE_STATIC (fn) = true; DECL_ARTIFICIAL (fn) = true; mark_used (fn); SET_IDENTIFIER_GLOBAL_VALUE (sname, fn); + /* In LIPO mode make sure we record the new global value so that it + is cleared before parsing the next aux module. */ + if (L_IPO_COMP_MODE && !is_parsing_done_p ()) +add_decl_to_current_module_scope (fn, + NAMESPACE_LEVEL (global_namespace)); } return fn; } @@ -3100,6 +3106,11 @@ get_tls_init_fn (tree var) DECL_BEFRIENDING_CLASSES (fn) = var; SET_IDENTIFIER_GLOBAL_VALUE (sname, fn); + /* In LIPO mode make sure we record the new global value so that it + is cleared before parsing the next aux module. */ + if (L_IPO_COMP_MODE && !is_parsing_done_p ()) +add_decl_to_current_module_scope (fn, + NAMESPACE_LEVEL (global_namespace)); } return fn; } @@ -3157,6 +3168,11 @@ get_tls_wrapper_fn (tree var) DECL_BEFRIENDING_CLASSES (fn) = var; SET_IDENTIFIER_GLOBAL_VALUE (sname, fn); + /* In LIPO mode make sure we record the new global value so that it + is cleared before parsing the next aux module. */ + if (L_IPO_COMP_MODE && !is_parsing_done_p ()) +add_decl_to_current_module_scope (fn, + NAMESPACE_LEVEL (global_namespace)); } return fn; } @@ -4213,8 +4229,12 @@ handle_tls_init (void) one_static_initialization_or_destruction (var, init, true); #ifdef ASM_OUTPUT_DEF - /* Output init aliases even with -fno-extern-tls-init. */ - if (TREE_PUBLIC (var)) + /* Output init aliases even with -fno-extern-tls-init. Don't emit + aliases in LIPO aux modules, since the corresponding __tls_init + will be static promoted and deleted, so the variable's tls init + function will be resolved by its own primary module. An alias + would prevent the promoted aux __tls_init from being deleted. */ + if (TREE_PUBLIC (var) && !L_IPO_IS_AUXILIARY_MODULE) { tree single_init_fn = get_tls_init_fn (var); if (single_init_fn == NULL_TREE) Index: l-ipo.c === --- l-ipo.c (revision 221425) +++ l-ipo.c (working copy) @@ -1120,6 +1120,27 @@ cgraph_unify_type_alias_sets (void) htab_delete (type_hash_tab); } +/* Return true if DECL is an artificial function that we do not want + to promote and which may not be available in the primary module. + The sole exception is currently __tls_init. */ + +static bool +decl_artificial_nopromote (tree decl) +{ + if (!DECL_ARTIFICIAL (decl)) +return false; + + /* Handle the __tls_init function specially as we do want to promote it and + allow the aux module to be resolved to the version in the primary module. + We check if it is prefixed by __tls_init to catch it after promotion + as well from cgraph_is_aux_decl_external. */ + if (!strncmp (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)), "__tls_init", +10)) +return false; + + return true; +} + /* Return true if NODE->decl from an auxiliary module has external definition (and therefore is not needed for expansion). */ @@ -1144,7 +1165,7 @@ cgraph_is_aux_decl_exte
[PATCH, rs6000, testsuite] Fix PR target/64579, __TM_end __builtin_tend failed to return transactional state
PR target/64579 exposes a bug in the definitions of the HTM builtins and associated define_expands. All of the HTM builtins were defined to return true on success and false on failure, but that only makes sense for the tbegin builtin (which is unchanged here). It makes more sense for the other builtins to just return the contents of the CR register (usually cr0) that is updated upon execution of the HTM instruction instead, since it contains the Transaction Status bits. This makes them similar to the __builtin_ttest() builtin which already does that. I have also updated the gcc documentation to reflect these changes. I believe this redefinition of the builtins is "safe", since for the builtins I'm redefining, the common usage case is to just ignore the builtin return result altogether. This is proved by the recent fix by Adhemerval that fixed the tcheck pattern to not be a recormd form instruction. Had anyone actually used the builtin, the assembler would have flagged it as invalid. As a result of these changes, the __TM_end() XL intrinsic function which was the topic of this bugzilla, now correctly returns the transaction status per the XL HTM API. I'll also note Adhemerval submitted a patch for LLVM to add HTM support for these HTM builtins and they match the changes I have made here. The XL compiler doesn't support our low level HTM builtins, so there's problems there. I have also added an executable HTM test case that tests the tbegin, tend, tsuspend, tresume and tcheck builtins are compiled, assembled and execute correctly. This passed bootstrapping and regtesting with no regressions. Is this ok for trunk (stage1 or stage4)? I'd like to also backport this to the release branches so that all of the branches agree on the builtin return values. Is that ok? Peter gcc/: PR target/64579 * config/rs6000/htm.md (tabort, tabortdc, tabortdci, tabortwc, tabortwci, ttest, tcheck, tend, trechkpt, treclaim, tsr): Use rs6000_emit_move_from_cr_field. (tbegin, tabort_internal): Use gpc_reg_operand. (tcheck_internal): Remove operand. (tabort_internal, tabortdc_internal, tabortdci_internal, tabortwc_internal, tabortwci_internal, ttest_internal, tcheck_internal, tend_internal, trechkpt_internal, treclaim_internal, tsr_internal): Remove * prefix from existing define_insn names. * config/rs6000/rs6000-builtin.def (tcheck): Remove builtin argument. * config/rs6000/rs6000-protos.h (rs6000_emit_move_from_cr_field): New. * config/rs6000/rs6000.c (rs6000_emit_move_from_cr_field): Likewise. * config/rs6000/rs6000.h (RS6000_BTC_32BIT): Delete. (RS6000_BTC_64BIT): Likewise. (RS6000_BTC_MISC_MASK): Update. * config/rs6000/htmxlintrin.h (__TM_end): Use _HTM_TRANSACTIONAL as expected value. * doc/extend.texi: Update documentation for htm builtins. gcc/testsuite/: PR target/64579 * gcc.target/powerpc/htm-builtin-1.c (__builtin_tcheck): Remove operand. * gcc.target/powerpc/htm-1.c: New test. * lib/target-supports.exp (check_htm_hw_available): New function. Index: gcc/config/rs6000/htm.md === --- gcc/config/rs6000/htm.md(revision 221519) +++ gcc/config/rs6000/htm.md(working copy) @@ -48,24 +48,19 @@ (define_c_enum "unspecv" (define_expand "tabort" - [(set (match_dup 2) - (unspec_volatile:CC [(match_operand:SI 1 "int_reg_operand" "")] - UNSPECV_HTM_TABORT)) - (set (match_dup 3) - (eq:SI (match_dup 2) - (const_int 0))) - (set (match_operand:SI 0 "int_reg_operand" "") - (xor:SI (match_dup 3) - (const_int 1)))] + [(match_operand:SI 0 "gpc_reg_operand" "") + (match_operand:SI 1 "gpc_reg_operand" "")] "TARGET_HTM" { - operands[2] = gen_rtx_REG (CCmode, CR0_REGNO); - operands[3] = gen_reg_rtx (SImode); + rtx cr = gen_reg_rtx (CCmode); + emit_insn (gen_tabort_internal (operands[1], cr)); + rs6000_emit_move_from_cr_field (operands[0], cr); + DONE; }) -(define_insn "*tabort_internal" +(define_insn "tabort_internal" [(set (match_operand:CC 1 "cc_reg_operand" "=x") - (unspec_volatile:CC [(match_operand:SI 0 "int_reg_operand" "r")] + (unspec_volatile:CC [(match_operand:SI 0 "gpc_reg_operand" "r")] UNSPECV_HTM_TABORT))] "TARGET_HTM" "tabort. %0" @@ -73,24 +68,19 @@ (define_insn "*tabort_internal" (set_attr "length" "4")]) (define_expand "tabortdc" - [(set (match_dup 4) - (unspec_volatile:CC [(match_operand 1 "u5bit_cint_operand" "n") -(match_operand:SI 2 "gpc_reg_operand" "r") -(match_operand:SI 3 "gpc_reg_operand" "r")] - UNSPECV_HTM_TABORTDC)) - (set (match_dup 5) - (eq:SI (match_dup 4) - (const_int 0))) - (set (match_o
Re: [Patch, Fortran] Extend (lib)coarray API/ABI documentation
On 03/10/2015 02:59 PM, Tobias Burnus wrote: This patch completes the description of the coarray library functions, invoked for -fcoarray=lib. OK for the trunk? (The currently documented functions can be seen at https://gcc.gnu.org/onlinedocs/gfortran/Coarray-Programming.html ) Tobias OK, thanks. Jerry
Re: [Patch, Fortran] Reject unsupported coarray communication
On 03/12/2015 12:18 AM, Tobias Burnus wrote: There are two groups of features which are not properly implemented with remote access: * "caf(:)[i]%a" might have a byte stride which is not compatible with the size of "a". (Fix: new array descriptor.) * All access which involves dereferencing pointers in a remote coarray (e.g. "caf[i]%ptr_comp = 5") are not supported. This patch now rejects them - instead of accepting them silently and doing the wrong things at runtime. Build and regtested on x86-64-gnu-linux OK for the trunk? Tobias OK for trunk. Jerry
Re: Fix PR 65177: diamonds are not valid execution threads for jump threading
On 03/20/2015 12:17 PM, Sebastian Pop wrote: Richard Biener wrote: On Thu, Mar 19, 2015 at 8:54 PM, Sebastian Pop wrote: Richard Biener wrote: please instead fixup after copy_bbs in duplicate_seme_region. Thanks for the review. Attached patch that does not modify copy_bbs. Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp Full bootstrap and regtest in progress on x86_64-linux. Ok for trunk? Looks good to me. Please also add a testcase. Attached. I will wait to commit until I have a green light from Jeff. Sorry, dealing with some non-technical stuff right now. I should have time to look at this over the weekend. jeff
Re: Fix PR 65177: diamonds are not valid execution threads for jump threading
Richard Biener wrote: > On Thu, Mar 19, 2015 at 8:54 PM, Sebastian Pop wrote: > > Richard Biener wrote: > >> please instead fixup after copy_bbs in duplicate_seme_region. > >> > > > > Thanks for the review. > > Attached patch that does not modify copy_bbs. > > Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp > > > > Full bootstrap and regtest in progress on x86_64-linux. Ok for trunk? > > Looks good to me. Please also add a testcase. Attached. I will wait to commit until I have a green light from Jeff. Sebastian >From f57f9a44717406cd148b3f432565654c36b33f2f Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Fri, 20 Mar 2015 18:12:47 +0100 Subject: [PATCH] diamonds are not valid execution threads for jump threading PR tree-optimization/65177 * tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread. (bb_in_bbs): New. (duplicate_seme_region): Renamed duplicate_thread_path. Redirect all edges not adjacent on the path to the original code. * gcc.dg/tree-ssa/ssa-dom-thread-10.c: New. --- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-10.c | 24 ++ gcc/tree-ssa-threadupdate.c | 93 +++-- 2 files changed, 91 insertions(+), 26 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-10.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-10.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-10.c new file mode 100644 index 000..4acf580 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-10.c @@ -0,0 +1,24 @@ +/* PR 65177 */ +/* { dg-do compile } */ +/* { dg-options "-O3" } */ + +typedef struct p7_profile_s {} P7_PROFILE; +enum p7t_statetype_e { + p7T_S = 4, p7T_N = 5, p7T_E = 7, p7T_C = 8, p7T_J = 10, }; +typedef struct p7_trace_s {} P7_TRACE; +typedef struct p7_gmx_s { + int L; +} P7_GMX; +static inline int select_c(const P7_PROFILE *gm, const P7_GMX *pp, const P7_GMX *gx, int i) { + float path[2]; + return ((path[0] > path[1]) ? p7T_C : p7T_E); +} +void p7_GOATrace(const P7_PROFILE *gm, const P7_GMX *pp, const P7_GMX *gx, P7_TRACE *tr) { + int i = gx->L; + int sprv, scur; + while (sprv != p7T_S) { +switch (sprv) { case p7T_C: scur = select_c(gm, pp, gx, i); break; } +if ( (scur == p7T_N || scur == p7T_J || scur == p7T_C) && scur == sprv) i--; +sprv = scur; + } +} diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index 7a159bb..610e807 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -2328,36 +2328,32 @@ bb_ends_with_multiway_branch (basic_block bb ATTRIBUTE_UNUSED) return false; } -/* Verify that the REGION is a Single Entry Multiple Exits region: make sure no - edge other than ENTRY is entering the REGION. */ +/* Verify that the REGION is a valid jump thread. A jump thread is a special + case of SEME Single Entry Multiple Exits region in which all nodes in the + REGION have exactly one incoming edge. The only exception is the first block + that may not have been connected to the rest of the cfg yet. */ DEBUG_FUNCTION void -verify_seme (edge entry, basic_block *region, unsigned n_region) +verify_jump_thread (basic_block *region, unsigned n_region) { - bitmap bbs = BITMAP_ALLOC (NULL); - for (unsigned i = 0; i < n_region; i++) -bitmap_set_bit (bbs, region[i]->index); +gcc_assert (EDGE_COUNT (region[i]->preds) <= 1); +} - for (unsigned i = 0; i < n_region; i++) -{ - edge e; - edge_iterator ei; - basic_block bb = region[i]; +/* Return true when BB is one of the first N items in BBS. */ - /* All predecessors other than ENTRY->src should be in the region. */ - for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ei_next (&ei)) - if (e != entry) - gcc_assert (bitmap_bit_p (bbs, e->src->index)); -} +static inline bool +bb_in_bbs (basic_block bb, basic_block *bbs, int n) +{ + for (int i = 0; i < n; i++) +if (bb == bbs[i]) + return true; - BITMAP_FREE (bbs); + return false; } -/* Duplicates a Single Entry Multiple Exit REGION (set of N_REGION basic - blocks). The ENTRY edge is redirected to the duplicate of the region. If - REGION is not a Single Entry region, ignore any incoming edges other than - ENTRY: this makes the copied region a Single Entry region. +/* Duplicates a jump-thread path of N_REGION basic blocks. + The ENTRY edge is redirected to the duplicate of the region. Remove the last conditional statement in the last basic block in the REGION, and create a single fallthru edge pointing to the same destination as the @@ -2369,7 +2365,7 @@ verify_seme (edge entry, basic_block *region, unsigned n_region) Returns false if it is unable to copy the region, true otherwise. */ static bool -duplicate_seme_region (edge entry, edge exit, +duplicate_thread_path (edge entry, edge exit, basic_block *region, unsigned n_region, basic_block *region_copy) { @@ -2
Re: [PATCH, stage1] Better error recovery for merge-conflict markers
> The patch is implemented within libcpp: any such conflict markers were > typically injected by tools that work on raw lines of unpreprocessed > text, so it seemed fitting to do it there. > > The error can be suppressed with -fno-detect-conflict-markers for > the case where you're using the compiler just for the C preprocessor > and some of the input files legitimately contain such character > sequences. Could a new option be avoided (and maintain backwards compatibility) simply by not detecting this error when only preprocessing? Cheers, Manuel.
[debug-early] Handle specification of class scoped static functions
Hi Jason. For class scoped static functions: class C { public: static void moo () {} }; ...we are calling gen_subprogram on moo() the usual handful of times (during early dwarf): once as a member of C and once because moo() is a reachable function. However, on the second time, we reuse the cached DIE and merely remove the DW_AT_declaration attribute: /* Clear out the declaration attribute, but leave the parameters so they can be augmented with location information later. */ remove_AT (subr_die, DW_AT_declaration); This causes us to reuse the DIE from within the class, instead of adding a specification of this cached DIE. With the attached tweak we fix this problem, and about a hundred gdb regressions. This IMHO is a "Good Thing" :). Would you be so kind as to look at this two-liner to make sure you're OK with it? Thanks. Aldy commit cec08d43caffbf086720ac3994d068010dc103c9 Author: Aldy Hernandez Date: Fri Mar 20 09:55:31 2015 -0700 Handle specification of class scoped static functions. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 8884afd..1325dfe 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -18770,7 +18770,20 @@ gen_subprogram_die (tree decl, dw_die_ref context_die) struct dwarf_file_data * file_index = lookup_filename (s.file); if (((is_cu_die (old_die->die_parent) || context_die == NULL - || dumped_early) + || (dumped_early + /* For class scoped static functions, the dumped early + version was the declaration, whereas the next time + around with a different context should be the + specification. In this case, avoid reusing the + DIE, but generate a specification below. E.g.: + + class C { + public: +static void moo () {} + }; + */ + && (!is_cu_die (context_die) + || !class_or_namespace_scope_p (old_die->die_parent && (DECL_ARTIFICIAL (decl) || (get_AT_file (old_die, DW_AT_decl_file) == file_index && (get_AT_unsigned (old_die, DW_AT_decl_line)
patch to fix PR64366
The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64366 The patch was bootstrapped on x86/x86-64, power64, and aarch64. Committed as rev.221541. 2015-03-20 Vladimir Makarov PR rtl-optimization/64366 * lra.c (lra_update_insn_regno_info): Consider regs in CALL_INSN_FUNCTION_USAGE memory. 2015-03-20 Vladimir Makarov PR rtl-optimization/64366 * gcc.target/sh/pr64366.c: New. Index: lra.c === --- lra.c (revision 221533) +++ lra.c (working copy) @@ -1633,7 +1633,8 @@ lra_update_insn_regno_info (rtx_insn *in lra_insn_recog_data_t data; struct lra_static_insn_data *static_data; enum rtx_code code; - + rtx link; + if (! INSN_P (insn)) return; data = lra_get_insn_recog_data (insn); @@ -1648,6 +1649,18 @@ lra_update_insn_regno_info (rtx_insn *in if ((code = GET_CODE (PATTERN (insn))) == CLOBBER || code == USE) add_regs_to_insn_regno_info (data, XEXP (PATTERN (insn), 0), uid, code == USE ? OP_IN : OP_OUT, false); + if (CALL_P (insn)) +/* On some targets call insns can refer to pseudos in memory in + CALL_INSN_FUNCTION_USAGE list. Process them in order to + consider their occurrences in calls for different + transformations (e.g. inheritance) with given pseudos. */ +for (link = CALL_INSN_FUNCTION_USAGE (insn); +link != NULL_RTX; +link = XEXP (link, 1)) + if (((code = GET_CODE (XEXP (link, 0))) == USE || code == CLOBBER) + && MEM_P (XEXP (XEXP (link, 0), 0))) + add_regs_to_insn_regno_info (data, XEXP (XEXP (link, 0), 0), uid, +code == USE ? OP_IN : OP_OUT, false); if (NONDEBUG_INSN_P (insn)) setup_insn_reg_info (data, freq); } Index: testsuite/gcc.target/sh/pr64366.c === --- testsuite/gcc.target/sh/pr64366.c (revision 0) +++ testsuite/gcc.target/sh/pr64366.c (working copy) @@ -0,0 +1,128 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -m4 -ml -mlra" } */ + +typedef int int8_t __attribute__ ((__mode__ (__QI__))); +typedef int int16_t __attribute__ ((__mode__ (__HI__))); +typedef int int32_t __attribute__ ((__mode__ (__SI__))); +typedef int int64_t __attribute__ ((__mode__ (__DI__))); +typedef unsigned char uint8_t; +typedef unsigned short int uint16_t; +typedef unsigned int uint32_t; +__extension__ typedef unsigned long long int uint64_t; +typedef int intptr_t; +typedef struct BigStruct{ + uint8_t a; + int8_t b; + uint16_t c; + int16_t d; + uint32_t e; + int32_t f; + uint64_t g; + int64_t h; + float i; + double j; + long double k; + char* l; + uint8_t m; + int8_t n; + uint16_t o; + int16_t p; + uint32_t q; + int32_t r; + uint64_t s; + int64_t t; + float u; + double v; + long double w; + char* x; + uint8_t y; + int8_t z; + uint16_t aa; + int16_t bb; + uint32_t cc; + int32_t dd; + uint64_t ee; + int64_t ff; + float gg; + double hh; + long double ii; + char* jj; + uint8_t kk; + int8_t ll; + uint16_t mm; + int16_t nn; + uint32_t oo; + int32_t pp; + uint64_t qq; + int64_t rr; + float ss; + double tt; + long double uu; + char* vv; + uint8_t ww; + int8_t xx; +} BigStruct; + +extern void foobar(); + +void +test_large_fn (uint8_t ui8_1, int8_t si8_1, uint16_t ui16_1, int16_t si16_1, + uint32_t ui32_1, int32_t si32_1, uint64_t ui64_1, int64_t si64_1, + float f_1, double d_1, long double ld_1, char* p_1, + uint8_t ui8_2, int8_t si8_2, uint16_t ui16_2, int16_t si16_2, + uint32_t ui32_2, int32_t si32_2, uint64_t ui64_2, int64_t si64_2, + float f_2, double d_2, long double ld_2, char* p_2, + uint8_t ui8_3, int8_t si8_3, uint16_t ui16_3, int16_t si16_3, + uint32_t ui32_3, int32_t si32_3, uint64_t ui64_3, int64_t si64_3, + float f_3, double d_3, long double ld_3, char* p_3, + uint8_t ui8_4, int8_t si8_4, uint16_t ui16_4, int16_t si16_4, + uint32_t ui32_4, int32_t si32_4, uint64_t ui64_4, int64_t si64_4, + float f_4, double d_4, long double ld_4, char* p_4, + uint8_t ui8_5, int8_t si8_5) +{ +BigStruct retVal = + { + ui8_1 + 1, si8_1 + 1, ui16_1 + 1, si16_1 + 1, + ui32_1 + 1, si32_1 + 1, ui64_1 + 1, si64_1 + 1, + f_1 + 1, d_1 + 1, ld_1 + 1, (char*)((intptr_t)p_1 + 1), + ui8_2 + 2, si8_2 + 2, ui16_2 + 2, si16_2 + 2, + ui32_2 + 2, si32_2 + 2, ui64_2 + 2, si64_2 + 2, + f_2 + 2, d_2 + 2, ld_2 + 2, (char*)((intptr_t)p_2 + 2), + ui8_3 + 3, si8_3 + 3, ui16_3 + 3, si16_3 + 3, + ui32_3 + 3, si32_3 + 3, ui64_3 + 3, si64_3 + 3, + f_3 + 3, d_3 + 3, ld_3 + 3, (char*)((intptr_t)p_3 + 3), + ui8_4 + 4, si8_4 + 4, ui16_4 + 4, si16_4 + 4, + ui32_4 + 4, si32_4 + 4, ui64_4 + 4, si64_4 + 4
Re: [PATCH] Fix thunk expansion (PR ipa/64896)
> > gcc/ > > 2015-03-11 Yvan Roux > > > > Backport from trunk r216841. > > 2014-10-29 Martin Liska > > > > PR ipa/63587 > > * cgraphunit.c (cgraph_node::expand_thunk): Only VAR_DECLs are put > > to local declarations. > > * function.c (add_local_decl): Implementation moved from header > > file, assert introduced for tree type. > > * function.h: Likewise. > > > > gcc/testsuite/ > > 2015-03-11 Yvan Roux > > > > Backport from trunk r216841. > > 2014-10-29 Martin Liska > > > > PR ipa/63587 > > * g++.dg/ipa/pr63587-1.C: New test. > > * g++.dg/ipa/pr63587-2.C: New test. > > > > - PR 64813 - > > 2015-03-11 Yvan Roux > > > > Backport from trunk r220616. > > 2015-02-11 Martin Liska > > > > PR ipa/64813 > > * cgraphunit.c (cgraph_node::expand_thunk): Do not create > > a return value for call to a function that is noreturn. OK, thanks! Honza
Re: [PATCH, stage1] Better error recovery for merge-conflict markers
On Fri, 20 Mar 2015, David Malcolm wrote: > I believe that the presense of these markers in source code is almost > always a bug (are there any GCC frontends in which the markers are > parsable as something valid?) Well, obviously they are valid inside #if 0, strings (where you have a test, though not one at start of line "\ <<<") and comments (where you don't have a test). They are also valid when stringized: #define str(s) #s const char *s = str( <<< ); must be accepted. They are also valid in the expansion of a macro that doesn't get expanded. #define foo \ <<< That is, in general, the invalidity only occurs when preprocessing tokens are converted to tokens. In C++ (C++11 and later), >>> can also close a sequence of nested template argument lists, thanks to the rule about replacing >> by > > in that context. And of course it's OK, if odd, to put that at the start of a line. So in that case the preprocessing tokens do get converted to tokens, and that token sequence (interpreted as >> >> >> > and then contextually adjusted to > > > > > > >) is valid. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Speed-up def_builtin_const (ix86_valid_target_attribute)
> gcc/ChangeLog: > > 2015-03-20 Martin Liska > Jakub Jelinek > > * config/i386/i386.c (def_builtin): Set deferred_isa_values for > masks that can potentially include a builtin. > (ix86_add_new_builtins): Introduce fast filter for isa values > that cannot trigger builtin inclusion. Thanks, this looks better. Patch is OK if it passes testing. Honza > --- > gcc/config/i386/i386.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 47deda7..82a4848 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -30588,6 +30588,8 @@ struct builtin_isa { > > static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX]; > > +/* Bits that can still enable any inclusion of a builtin. */ > +static HOST_WIDE_INT deferred_isa_values = 0; > > /* Add an ix86 target builtin function with CODE, NAME and TYPE. Save the > MASK > of which isa_flags to use in the ix86_builtins_isa array. Stores the > @@ -30631,6 +30633,9 @@ def_builtin (HOST_WIDE_INT mask, const char *name, > } >else > { > + /* Just a MASK where set_and_not_built_p == true can potentially > + include a builtin. */ > + deferred_isa_values |= mask; > ix86_builtins[(int) code] = NULL_TREE; > ix86_builtins_isa[(int) code].tcode = tcode; > ix86_builtins_isa[(int) code].name = name; > @@ -30666,6 +30671,12 @@ def_builtin_const (HOST_WIDE_INT mask, const char > *name, > static void > ix86_add_new_builtins (HOST_WIDE_INT isa) > { > + if ((isa & deferred_isa_values) == 0) > +return; > + > + /* Bits in ISA value can be removed from potential isa values. */ > + deferred_isa_values &= ~isa; > + >int i; >tree saved_current_target_pragma = current_target_pragma; >current_target_pragma = NULL_TREE; > -- > 2.1.2 >
[debug-early] disallow type DIEs in limbo
Somewhere in the hurricane of debug-early patches of the past month, we fixed this and forgot verify it was still an issue. Types are no longer appearing in limbo in late dwarf, so there is no reason to special case this. Tested with the guality.exp framework and the gdb testsuite. Committed to branch. Aldy commit 703f3b342cd2d1aa7ec3af45881e08e5eea6942a Author: Aldy Hernandez Date: Fri Mar 20 10:42:05 2015 -0700 Disallow limbo type DIEs in late dwarf generation. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 48bd5b8..8884afd 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -4950,10 +4950,6 @@ new_die (enum dwarf_tag tag_value, dw_die_ref parent_die, tree t) them up. */ && (TREE_CODE (t) != FUNCTION_DECL || !decl_function_context (t)) - /* FIXME: Allow types for now. We are getting some internal -template types from inlining (building libstdc++). -Templates need to be looked at. */ - && !TYPE_P (t) /* FIXME: Allow late limbo DIE creation for LTO, especially in the ltrans stage, but once we implement LTO dwarf streaming, we should remove this exception. */
Re: [PATCH] PR target/65240, Fix Power{7,8} insn constraint issue with -O3 -ffast-math
On Wed, Mar 18, 2015 at 12:08:47PM +0100, Richard Biener wrote: > Did you double-check if there are any differences in generated code? > Esp. the SPEC INT benchmarks look odd - they don't contain any > FP code. SpecINT does contain some amount of floating point. Off the top of my head: 1) Bzip2 does a percentage calculation for fprintf; 2) Libquantum uses sin/cos (which the compiler optimizes to sincos) in the initial setup to set up the data. 3) I don't recall if the version of GCC used in Spec had switched over to using floating point for the register allocator or not. 4) Perlbench has a lot of code that does floating pointing point, but the main loop excerised in the Spec runs probably doesn't use FP. Sorry I couldn't respond earlier, the corporate IMAP email server was down for a period of time. Any way, I do see code changes. Before the fix was made, if you used -ffast-math, it kept the floating point constants around until reload. When reload could not find a reload to load the constant, it would push the constant to memory, and do validize_mem on the address. This created the sequence: addis 9,2,.LC0@toc@ha addi 9,9,.LC0@toc@l lfd 0,0(9) Because the address was a single register, it could also load the value into the traditional Altivec registers: addis 9,2,.LC0@toc@ha addi 9,9,.LC0@toc@l lxsdx 32,0(9) And in fact the register allocator seemed to prefer loading constants into the traditional Altivec registers instead of the traditional floating point registers. Once I made the change to force the constant to memory earlier, it would use normal addressing, and generate something like: addis 9,2,.LC0@toc@ha lfs 0,.LC0@toc@l(9) This meant that a lot of addi's are no longer generated, because the addi is folded into the lfs/lfd instruction. In addition, due to the VSX memory instructions being only register+register, whenever the code wanted to load floating point constants, it would prefer the traditional floating point registers which had register+offset addressing. This meant in turn, that any instruction that used a FP constant could potentially be changed from a VSX form (i.e. xsmuldp) into a traditional FP form (i.e. fmul) if all of the inputs and outputs were traditional floating point registers. Finally, I suspect pushing the address out earlier, means that it might be keeping the address live a little bit longer, which could change things if we are spilling GPRs. One of the things I am working on for GCC 6.0 is going back to reworking on the address support to do a better job with fusion, and I believe it will reduce the life time for some of these address temporaries. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH] Fix __has_{cpp_}attribute with -traditional-cpp (PR preprocessor/65238)
Jason Merrill writes: > On 03/20/2015 12:48 PM, Jakub Jelinek wrote: >> On Fri, Mar 20, 2015 at 12:30:44PM -0400, Jason Merrill wrote: >>> On 03/11/2015 03:10 PM, Jakub Jelinek wrote: __has_{cpp_,}attribute builtin macros are effectively function-like macros taking one argument (and the ISO preprocessor expands macros in the argument which is IMHO desirable), but the traditional preprocessor has been crashing on them or reporting errors. >>> >>> Why do we want ISO preprocessor behavior in this specific situation? >> >> You mean that we would handle >> #define U unused >> #if __has_attribute(U) >> int u __attribute__((unused)); >> #endif >> differently between ISO and traditional preprocessing? > >> That would be surprising to users. > > Why surprising? Don't users of the traditional preprocessor expect > traditional preprocessor behavior? One of the reasons why I thought it'd be "nice" to have the traditionnal mode support the macro-expansion of the arguments here is that there already are cases where the traditionnal mode supports ISO behaviour. For instance, the documentation of cpp says: 10.3 Traditional miscellany === Here are some things to be aware of when using the traditional preprocessor. [...] * A true traditional C preprocessor does not recognize '#error' or '#pragma', and may not recognize '#elif'. CPP supports all the directives in traditional mode that it supports in ISO mode, including extensions, with the exception that the effects of '#pragma GCC poison' are undefined. So I thought this useful particular use case of __has_attribute(U) might well be another of such case even if it's not a directive. Just my 2 cents. -- Dodji
Re: [PATCH] Fix __has_{cpp_}attribute with -traditional-cpp (PR preprocessor/65238)
On Fri, Mar 20, 2015 at 01:15:52PM -0400, Jason Merrill wrote: > On 03/20/2015 12:48 PM, Jakub Jelinek wrote: > >On Fri, Mar 20, 2015 at 12:30:44PM -0400, Jason Merrill wrote: > >>On 03/11/2015 03:10 PM, Jakub Jelinek wrote: > >>>__has_{cpp_,}attribute builtin macros are effectively function-like macros > >>>taking one argument (and the ISO preprocessor expands macros in the > >>>argument > >>>which is IMHO desirable), but the traditional preprocessor has been > >>>crashing > >>>on them or reporting errors. > >> > >>Why do we want ISO preprocessor behavior in this specific situation? > > > >You mean that we would handle > >#define U unused > >#if __has_attribute(U) > >int u __attribute__((unused)); > >#endif > >differently between ISO and traditional preprocessing? > > >That would be surprising to users. > > Why surprising? Don't users of the traditional preprocessor expect > traditional preprocessor behavior? Well, but the traditional preprocessor behavior in the end is not in not expanding macro arguments, normally they actually are expanded, but not immediately, but because of trying to preprocess the result of preprocessing again and again. When traditionally preprocessing #if/#elif etc. directives, the directive line is preprocessed until no macros are present and only then fed to the ISO preprocessor to evaluate and handle the directive. The problem with the builtin function-like macros is that it needs the argument immediately. Jakub
Re: [PATCH] Fix __has_{cpp_}attribute with -traditional-cpp (PR preprocessor/65238)
On 03/20/2015 12:48 PM, Jakub Jelinek wrote: On Fri, Mar 20, 2015 at 12:30:44PM -0400, Jason Merrill wrote: On 03/11/2015 03:10 PM, Jakub Jelinek wrote: __has_{cpp_,}attribute builtin macros are effectively function-like macros taking one argument (and the ISO preprocessor expands macros in the argument which is IMHO desirable), but the traditional preprocessor has been crashing on them or reporting errors. Why do we want ISO preprocessor behavior in this specific situation? You mean that we would handle #define U unused #if __has_attribute(U) int u __attribute__((unused)); #endif differently between ISO and traditional preprocessing? That would be surprising to users. Why surprising? Don't users of the traditional preprocessor expect traditional preprocessor behavior? Jason
Re: [Patch Testsuite] Make all_attributes.cc in to (almost_)all_attributes.cc for ARM.
On 20/03/15 17:13 +, James Greenhalgh wrote: This patch just disables the check in this test case for "unused" when testing for ARM, which resolves the issue. Tested on arm-none-linux-gnueabihf to confirm it clears the FAIL. OK, thanks.
[Patch Testsuite] Make all_attributes.cc in to (almost_)all_attributes.cc for ARM.
Hi, As discussed on IRC, glibc's sysdeps/unix/sysv/linux/arm/sys/ucontext.h for ARM has a field named "unused", in a struct which looks like this: struct _libc_fpstate { struct { unsigned int sign1:1; unsigned int unused:15; unsigned int sign2:1; unsigned int exponent:14; unsigned int j:1; unsigned int mantissa1:31; unsigned int mantissa0:32; } fpregs[8]; unsigned int fpsr:32; unsigned int fpcr:32; unsigned char ftype[8]; unsigned int init_flag; }; The all_attributes testcase #defines unused to '1', giving unsigned int 1:15; which is not going to work. This patch just disables the check in this test case for "unused" when testing for ARM, which resolves the issue. Tested on arm-none-linux-gnueabihf to confirm it clears the FAIL. OK? Thanks, James --- 2015-03-20 James Greenhalgh * testsuite/17_intro/headers/c++1998/all_attributes.cc: Disable test for unused for ARM. * testsuite/17_intro/headers/c++200x/all_attributes.cc: Likewise. * testsuite/17_intro/headers/c++2014/all_attributes.cc: Likewise. diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc b/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc index 6fc362a..7bc7ffe 100644 --- a/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc +++ b/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc @@ -30,7 +30,10 @@ #endif #define packed 1 #define pure 1 +// glibc's sysdeps/unix/sysv/linux/arm/sys/ucontext.h uses this on ARM. +#ifndef __arm__ #define unused 1 +#endif #include // TODO: this is missing from #include diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc index 0726e3f..8d93fd9 100644 --- a/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc +++ b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc @@ -29,7 +29,10 @@ #endif #define packed 1 #define pure 1 +// glibc's sysdeps/unix/sysv/linux/arm/sys/ucontext.h uses this on ARM. +#ifndef __arm__ #define unused 1 +#endif #include // TODO: this is missing from #include// TODO: this is missing from diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc b/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc index 06bcb8e..c482fbd 100644 --- a/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc +++ b/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc @@ -29,7 +29,10 @@ #endif #define packed 1 #define pure 1 +// glibc's sysdeps/unix/sysv/linux/arm/sys/ucontext.h uses this on ARM. +#ifndef __arm__ #define unused 1 +#endif #include // TODO: this is missing from #include // TODO: this is missing from
Re: [C PATCH] Fix up file-scope _Atomic expansion (PR c/65345)
On Thu, 12 Mar 2015, Marek Polacek wrote: > 2015-03-12 Marek Polacek > > PR c/65345 > * c-decl.c (set_labels_context_r): New function. > (store_parm_decls): Call it via walk_tree_without_duplicates. > * c-typeck.c (convert_lvalue_to_rvalue): Use create_tmp_var_raw > instead of create_tmp_var. Build TARGET_EXPR instead of > COMPOUND_EXPR. > (build_atomic_assign): Use create_tmp_var_raw instead of > create_tmp_var. Build TARGET_EXPRs instead of MODIFY_EXPR. > > * gcc.dg/pr65345-1.c: New test. > * gcc.dg/pr65345-2.c: New test. OK for stage 1, but I think you may need a further patch that fixes the TARGET_ATOMIC_ASSIGN_EXPAND_FENV implementations to use create_tmp_var_raw and TARGET_EXPR as well, along with adding tests for floating-point compound assignment / increment / decrement in the problem contexts. A backport to GCC 5 branch could be considered after a while on trunk. -- Joseph S. Myers jos...@codesourcery.com
Re: [Patch, Fortran, PR 64787 a.o., v2] Invalid code on sourced allocation of class(*) character string
Dear Andre, This patch fixes the issues I have reported in my previous mail. In addition it allows the Metcalf’s oo.f90 to compiles and gives sensible results at run time (yes, I have noticed the patch to fix the memory leak, and it works). I have a two comments: The list of the (partially) fixed PRs should appear in the Changelog (IIRC no gcc before fortran). I think the English in the comments should be checked. Although I can probably fix the most obvious offenders, English is not my mother language and I have been a terrible pupil (in the E- range) when I was supposed to learn the grammar! Thanks for working on these issues. Cheers, Dominique > Le 19 mars 2015 à 14:03, Andre Vehreschild a écrit : > > Hi Dominique, Hi all, > > please find attached a new version of the patch to fix pr64787 after > processing > Dominique's comments. Thank you very much for your work, Dominique. > > The patch now also fixes: > pr63230 - allocation of deferred length character as derived type component > causes internal compiler error > pr51550 - ICE in gfc_get_derived_type, at fortran/trans-types.c:2401 (I > believe > the fortran code in the pr is not legal and fixed it; the fixed one now runs.) > > It partially fixes: > pr55901 - [OOP] type is (character(len=*)) misinterpreted as array > (The codes compile and run, but valgrind reports accesses to uninitialized > memory; I am looking into this.) > pr54070 - [4.8/4.9/5 Regression] Wrong code with allocatable deferred-length > (array) function results > (Compiles again (didn't with the first version of the patch for 64787), but > still segfaults at runtime; -> agenda) > > This patch needs my previous patches as stated in: > https://gcc.gnu.org/ml/fortran/2015-03/msg00076.html > > Bootstraps and regtests ok on x86_64-linux-gnu/F20. > > Reviews and comments welcome. > > Regards, > Andre > -- > Andre Vehreschild * Email: vehre ad gmx dot de
[PATCH] Enable two UNSIGNED_FLOAT simplifications in simplify_unary_operation_1
Hi all, This is a simple patch to enable two simplifications for UNSIGNED_FLOAT expression. For the following rtx patterns, they can be simplified when the integer x can be represented in float mode without precision loss: float_truncate (float x) --> float x float_extend (float x) --> float x Those two simplifications are also applicable to UNSIGNED_FLOAT expression. For example, compile the following code using aarch64-none-elf toolchain with -O1 flag. double f1 (uint16_t x) { return (double)(float)x; } Before the change, the compiler generates the following code: f1: uxthw0, w0 ucvtf s0, w0 fcvtd0, s0 ret After the change, the following simplified asm code snipts are generated. f1: uxthw0, w0 ucvtf d0, w0 ret aarch64-none-elf regression test runs Okay. x86_64 bootstraps Okay. Okay to commit? gcc/ChangeLog: 2015-03-20 Renlin Li * simplify-rtx.c (simplify_unary_operation_1): Fix a typo. Enable two simplifications for UNSIGNED_FLOAT. gcc/testsuite/ChangeLog: 2015-03-20 Renlin Li * gcc.target/aarch64/unsigned-float.c: New. * gcc.target/arm/unsigned-float.c: New.diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index 5d17498..4b18d3c 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -1171,7 +1171,7 @@ simplify_unary_operation_1 (enum rtx_code code, machine_mode mode, rtx op) = (float_truncate:SF foo:DF). (float_truncate:DF (float_extend:XF foo:SF)) - = (float_extend:SF foo:DF). */ + = (float_extend:DF foo:SF). */ if ((GET_CODE (op) == FLOAT_TRUNCATE && flag_unsafe_math_optimizations) || GET_CODE (op) == FLOAT_EXTEND) @@ -1183,14 +1183,14 @@ simplify_unary_operation_1 (enum rtx_code code, machine_mode mode, rtx op) XEXP (op, 0), mode); /* (float_truncate (float x)) is (float x) */ - if (GET_CODE (op) == FLOAT + if ((GET_CODE (op) == FLOAT || GET_CODE (op) == UNSIGNED_FLOAT) && (flag_unsafe_math_optimizations || (SCALAR_FLOAT_MODE_P (GET_MODE (op)) && ((unsigned)significand_size (GET_MODE (op)) >= (GET_MODE_PRECISION (GET_MODE (XEXP (op, 0))) - num_sign_bit_copies (XEXP (op, 0), GET_MODE (XEXP (op, 0 - return simplify_gen_unary (FLOAT, mode, + return simplify_gen_unary (GET_CODE (op), mode, XEXP (op, 0), GET_MODE (XEXP (op, 0))); @@ -1221,7 +1221,7 @@ simplify_unary_operation_1 (enum rtx_code code, machine_mode mode, rtx op) rounding can't happen. */ if (GET_CODE (op) == FLOAT_EXTEND - || (GET_CODE (op) == FLOAT + || ((GET_CODE (op) == FLOAT || GET_CODE (op) == UNSIGNED_FLOAT) && SCALAR_FLOAT_MODE_P (GET_MODE (op)) && ((unsigned)significand_size (GET_MODE (op)) >= (GET_MODE_PRECISION (GET_MODE (XEXP (op, 0))) diff --git a/gcc/testsuite/gcc.target/aarch64/unsigned-float.c b/gcc/testsuite/gcc.target/aarch64/unsigned-float.c new file mode 100644 index 000..c5ad680 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/unsigned-float.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O1" } */ + +#include + +double +f1 (uint16_t x) +{ + return (double)(float)x; +} + +float +f2 (uint16_t x) +{ + return (float)(double)x; +} + +/* { dg-final { scan-assembler-not "fcvt" } } */ diff --git a/gcc/testsuite/gcc.target/arm/unsigned-float.c b/gcc/testsuite/gcc.target/arm/unsigned-float.c new file mode 100644 index 000..bb05c85 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/unsigned-float.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_vfp_ok } */ +/* { dg-options "-march=armv7-a -O1 -mfloat-abi=softfp" } */ +/* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */ + +#include + +double +f1 (uint16_t x) +{ + return (double)(float)x; +} + +float +f2 (uint16_t x) +{ + return (float)(double)x; +} + +/* { dg-final { scan-assembler-not "vcvt.(f32.f64|f64.f32)" } } */
[debug-early] minor cleanups
Just cleaning up some left over or non-applicable comments. Committed to branch. commit b783ed568b4b56d42f4203ce32d964104be6a368 Author: Aldy Hernandez Date: Fri Mar 20 09:49:23 2015 -0700 Minor cleanups. diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c index c538d2e..c4ffcf2 100644 --- a/gcc/c/c-decl.c +++ b/gcc/c/c-decl.c @@ -10675,13 +10675,8 @@ c_write_global_declarations_1 (tree globals) } while (reconsider); - /* ?? For completeness, we could stop the TV_PHASE_DEFERRED timer - here, and start the TV_PHASE_DBGINFO timer. Is it worth it, or - would it convolute things? */ for (decl = globals; decl; decl = DECL_CHAIN (decl)) check_global_declaration_1 (decl); - /* ?? Similarly here. Stop TV_PHASE_DBGINFO and start - TV_PHASE_DEFERRED again. */ } /* Callback to collect a source_ref from a DECL. */ diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 76fd70b..48bd5b8 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -18040,9 +18040,7 @@ gen_formal_parameter_die (tree node, tree origin, bool emit_name_p, { /* Check that parm_die already has the right attributes that we would have added below. If any attributes are -missing, fall through to add them. - -?? Add more checks here. */ +missing, fall through to add them. */ if (! DECL_ABSTRACT_P (node_or_origin) && !get_AT (parm_die, DW_AT_location) && !get_AT (parm_die, DW_AT_const_value)) diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc index 42cced4..7965b11 100644 --- a/gcc/go/go-gcc.cc +++ b/gcc/go/go-gcc.cc @@ -3084,12 +3084,6 @@ Gcc_backend::write_global_definitions( wrapup_global_declarations(defs, i); - /* ?? Can we leave this call here, thus getting called before - finalize_compilation_unit? - - Originally this was called *AFTER* finalize_compilation_unit. If - `go' really needs this call after finalize_compilation_unit, we - can use LANG_HOOKS_POST_COMPILATION_PARSING_CLEANUPS (yuck). */ check_global_declarations(defs, i); delete[] defs; diff --git a/gcc/tree.c b/gcc/tree.c index 06d30e8..97b9170 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -5257,8 +5257,6 @@ free_lang_data_in_decl (tree decl) At this point, it is not needed anymore. */ DECL_SAVED_TREE (decl) = NULL_TREE; - /* ?? This should be OK to remove now that we are generating dwarf -early. */ /* Clear the abstract origin if it refers to a method. Otherwise dwarf2out.c will ICE as we clear TYPE_METHODS and thus the origin will not be output correctly. */
Re: [PATCH] Fix __has_{cpp_}attribute with -traditional-cpp (PR preprocessor/65238)
On Fri, Mar 20, 2015 at 12:30:44PM -0400, Jason Merrill wrote: > On 03/11/2015 03:10 PM, Jakub Jelinek wrote: > >__has_{cpp_,}attribute builtin macros are effectively function-like macros > >taking one argument (and the ISO preprocessor expands macros in the argument > >which is IMHO desirable), but the traditional preprocessor has been crashing > >on them or reporting errors. > > Why do we want ISO preprocessor behavior in this specific situation? You mean that we would handle #define U unused #if __has_attribute(U) int u __attribute__((unused)); #endif differently between ISO and traditional preprocessing? That would be surprising to users. IMHO either we want to expand the arguments in both cases (what the patch does), or in none (that would be then consistent with clang++, guess would mean adding pfile->state.prevent_expansion++; / pfile->state.prevent_expansion--; pair around something in the ISO case, and would slightly but not too much simplify the traditional __has_attribute handling; still we'd need to build the buffer with the argument and feed it to the langhook, which parses it with ISO preprocessor with disabled expansion). Jakub
Re: [PATCH] Fix __has_{cpp_}attribute with -traditional-cpp (PR preprocessor/65238)
On 03/11/2015 03:10 PM, Jakub Jelinek wrote: __has_{cpp_,}attribute builtin macros are effectively function-like macros taking one argument (and the ISO preprocessor expands macros in the argument which is IMHO desirable), but the traditional preprocessor has been crashing on them or reporting errors. Why do we want ISO preprocessor behavior in this specific situation? Jason
[PATCH][expmed][trivial] Fix comment about multiplying by T-1 and adding T
Hi all, This comment in expmed.c talks about multiplying by a value that ends in ...01 by doing the T-1 multiplication first and then adding T to get the final result. But the typo says '1' instead of 'T'. Similarly with the T+1, -T case. I'll apply this patch on Monday unless someone objects. Thanks, Kyrill 2015-03-19 Kyrylo Tkachov * expmed.c (synth_mult): Fix comment about multiplying by T-1 and adding T or multiplying by T+1 and subracting T.diff --git a/gcc/expmed.c b/gcc/expmed.c index d2b2534..ffbf462 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -2613,7 +2613,7 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, Thus we prefer addition in that case. */ && t != 3)) { - /* T ends with ...111. Multiply by (T + 1) and subtract 1. */ + /* T ends with ...111. Multiply by (T + 1) and subtract T. */ op_cost = add_cost (speed, mode); new_limit.cost = best_cost.cost - op_cost; @@ -2633,7 +2633,7 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, } else { - /* T ends with ...01 or ...011. Multiply by (T - 1) and add 1. */ + /* T ends with ...01 or ...011. Multiply by (T - 1) and add T. */ op_cost = add_cost (speed, mode); new_limit.cost = best_cost.cost - op_cost;
[PATCH, stage1] Better error recovery for merge-conflict markers
Various tools that operate on source code files will inject markers into them when an unfixable conflict occurs in a merger. There appears to be no blessed standard for these conflict markers, but an ad-hoc convention is for 7 '<' , '=', or '>' characters at the start of a line, followed optionally by a space and optional text e.g.: <<< HEAD extern int some_var; === extern short some_var; >>> Some other branch This convention is followed by GNU patch: http://git.savannah.gnu.org/cgit/patch.git/tree/src/merge.c by git: http://git.kernel.org/cgit/git/git.git/tree/Documentation/merge-config.txt and by various other tools. I believe that the presense of these markers in source code is almost always a bug (are there any GCC frontends in which the markers are parsable as something valid?) Currently the above leads to cryptic error messages from GCC: foo.c:1:1: error: expected identifier or ‘(’ before ‘<<’ token <<< HEAD ^ foo.c:3:1: error: expected identifier or ‘(’ before ‘==’ token === ^ foo.c:5:1: error: expected identifier or ‘(’ before ‘>>’ token >>> Some other branch ^ This is typically followed by a cascade of other error messages. The attached patch implements detection of such patch conflict markers, issuing errors about each one that it encounters: foo.c:1:1: error: source file contains patch conflict marker <<< HEAD ^ foo.c:3:1: error: source file contains patch conflict marker === ^ foo.c:5:1: error: source file contains patch conflict marker >>> Some other branch ^ It attempts to continue the failure more gracefully by only parsing the first hunk of the change and ignoring the second (I arbitrarily picked the top hunk; I don't know if there's a better way to deal with this, but it's already a failure mode). It also attempts to handle nested conflicts. This nesting works somewhat analogously to a: #error ...warn about the conflict marker #if 1 /* accept input at <<< */ #else /* start suppressing input at === */ #endif /* finish suppressing input at >>> */ but without imposing any assumptions that the conflict markers are balanced. The patch is implemented within libcpp: any such conflict markers were typically injected by tools that work on raw lines of unpreprocessed text, so it seemed fitting to do it there. The error can be suppressed with -fno-detect-conflict-markers for the case where you're using the compiler just for the C preprocessor and some of the input files legitimately contain such character sequences. Successfully bootstrapped®rtested on x86_64-unknown-linux-gnu (Fedora 20), with 25 new "PASS" results in gcc.sum (relative to a control build of r221492), for the new test cases. OK for next stage1? gcc/c-family/ChangeLog: * c-opts.c (sanitize_cpp_opts): Set up cpp_opts detect_conflict_markers based on flag_detect_conflict_markers. gcc/ChangeLog: * common.opt (fdetect-conflict-markers): New option. gcc/testsuite/ChangeLog: * gcc.dg/patch-conflict-markers-1.c: New test case. * gcc.dg/patch-conflict-markers-2.c: Likewise. * gcc.dg/patch-conflict-markers-3.c: Likewise. * gcc.dg/patch-conflict-markers-4.c: Likewise. * gcc.dg/patch-conflict-markers-5.c: Likewise. * gcc.dg/patch-conflict-markers-6.c: Likewise. * gcc.dg/patch-conflict-markers-7.c: Likewise. libcpp/ChangeLog: * Makefile.in (libcpp_a_OBJS): Add conflict-markers.o. (libcpp_a_SOURCES): Add conflict-markers.c. * conflict-markers.c: New file. * include/cpplib.h (TTYPE_TABLE): Add CONFLICT_MARKER_BEGIN, CONFLICT_MARKER_MIDDLE, CONFLICT_MARKER_END. (struct cpp_options): Add field "detect_conflict_markers". * init.c (cpp_create_reader): Initialize detect_conflict_markers field. * internal.h (struct lexer_state): Add field "on_conflict_marker_line". (struct cpp_buffer): Add field "cm_stack". (_cpp_report_conflict_marker): New function decl. (_cpp_handle_conflict_marker): New function decl. * lex.c (_cpp_lex_token): When encountering a conflict marker token, call _cpp_handle_conflict_marker and skip it. (_cpp_get_fresh_line): Update logic for injecting CPP_EOF to also do it when handling conflict markers. (_cpp_lex_direct): Update comment to reflect above change to _cpp_get_fresh_line. Update handling of '<', '=' and '>' to implement detection of conflict markers. --- gcc/c-family/c-opts.c | 2 + gcc/common.opt | 4 + gcc/testsuite/gcc.dg/patch-conflict-markers-1.c | 10 +++ gcc/testsuite/gcc.dg/patch-conflict-markers-2.c | 2 + gcc/testsuite/gcc.dg/patch-conflict-markers-3.c | 20 + gcc/testsuite/gcc.dg/patch-conflict-markers-4.c | 11 +++ gcc/testsuite/gcc.dg/patch-conflict-markers-5.c | 11 +++ gcc/testsuite/gcc.dg/patch-conflict-marke
Re: C++ PATCH for c++/65398 (valid constexpr rejected)
On Fri, Mar 20, 2015 at 11:15:30AM -0400, Jason Merrill wrote: > On 03/20/2015 11:11 AM, Jakub Jelinek wrote: > >>&s[3] is the address of the terminal \0. > > > >Yeah, sure. But the above testcase does &s[4], which is out of bounds > >arithmetics > > But since &s[3] is the address of an element of the array (the NUL), &s[4] > is one-past-the-end, which is fine. &s[5] would be out of bounds. Ashamed... Obviously I meant if we should diagnose *(&s[5] - 2). clang++ does. > Agreed. And I still think it makes sense to fix this as part of this patch. Yes. Jakub
Re: [GOOGLE] Fixes for TLS wrapper and init functions in LIPO mode
On Thu, Mar 19, 2015 at 10:38 PM, Xinliang David Li wrote: > On Thu, Mar 19, 2015 at 9:57 PM, Teresa Johnson wrote: >> On Thu, Mar 19, 2015 at 8:00 PM, Xinliang David Li >> wrote: >>> ok -- then there is an over assertion in get_local_tls_init_fn. The >>> method can be called for aux module -- the decl created will also be >>> promoted. >> >> No, it won't ever be called for an aux module. Both callsites are >> guarded (handle_tls_init has an early return at the top, and the >> callsite is guarded in get_tls_iniit_fn). >> >>> >>> Also can we rely on late promotion (special case of artificial >>> function __tls_init? This can avoid duplicated logic here. >> >> We don't need to promote __tls_init, but rather the init functions for >> each TLS variable (which are each aliased to __tls_init). > >> I thought >> about both approaches, but promoting it as it was created seemed more >> straightforward. I can give that approach a try though if you prefer >> it (special casing DECL_ARTIFICIAL functions that start with _ZTH). >> Incidentally they are not marked static, so that would also need to be >> done to get them promoted. > > For public tls symbols in aux module, tls wrappers will reference > _ZTHxx init functions which are aliases to __tls_init. If __tls_init > is not created for aux modules, what symbol will those _ZTHxx > funcitons aliased to? > > Is it better just let _tls_init function to be created as usual, but > let the standard promotion kick in later? The ZTHxx functions can > still be marked as alias to the promoted __tls_init? So there are 3 function decls being generated for TLS variables: 1) A single __tls_init function per module This initializes *all* TLS variables in the module, guarded by a single __tls_guard variable to ensure it is only done once. 2) One _ZTHxx init function per TLS variable Each of these is aliased to the module's __tls_init 3) One _ZTWxx wrapper function per TLS variable This is called in place of accesses to that TLS variable. It invokes the _ZTHxx init function for the variable to initialize it if necessary. I am concerned about generating the aux module __tls_init functions and then promoting them as they contain initializations for all TLS variables from the aux module - generating it each time the module is included as an aux module seems inefficient at best. What happens in the case where the TLS variable is declared extern and accessed by a separate module (and essentially is what happens with my patch when it is file static, promoted and accessed via an aux module) is this: The _ZTHxx init functions are aliased to __tls_init in the module containing the definition, and are externally-visible (after promoting in the file-static exported primary module case). The external module containing the reference (or the primary module that imported the defining module as an aux module in the LIPO case) generates a _ZTWxx wrapper for the TLS variable, which references its _ZTHxx init function. This ends up resolved to the exported/promoted _ZTHxx symbol in the defining module, which in turn is aliased to the defining module's __tls_init function. Teresa > > David > > > > >> >> Teresa >> >>> >>> David >>> >>> On Thu, Mar 19, 2015 at 5:51 PM, Teresa Johnson >>> wrote: On Thu, Mar 19, 2015 at 4:45 PM, Xinliang David Li wrote: > does generate_tls_wrapper also need to be suppressed for aux module? No, we generate the wrapper in the aux module and use it to access the TLS variable and invoke it's init function (which is defined in the variable's own module). Presumably we could generate that only in the original module and promote it if necessary, but generating it in the aux module does allow it to be inlined. This is essentially how the TLS accesses are handled for extern TLS variables defined elsewhere (wrapper generated in referring module). Teresa > > David > > On Thu, Mar 19, 2015 at 4:11 PM, Teresa Johnson > wrote: >> New patch below. Passes regression tests plus internal application build. >> >> 2015-03-19 Teresa Johnson >> >> gcc/ >> Google ref b/19618364. >> * cp/decl2.c (get_local_tls_init_fn): Assert on aux module. >> (get_tls_init_fn): Promote non-public init functions if necessary >> in LIPO mode, record new global at module scope. >> (get_tls_wrapper_fn): Record new global at module scope. >> (handle_tls_init): Skip in aux module, setup alias in exported >> primary module. >> >> testsuite/ >> Google ref b/19618364. >> * testsuite/g++.dg/tree-prof/lipo/tls.h: New test. >> * testsuite/g++.dg/tree-prof/lipo/tls2.h: Ditto. >> * testsuite/g++.dg/tree-prof/lipo/tls2_0.C: Ditto. >> * testsuite/g++.dg/tree-prof/lipo/tls2_1.C: Ditto. >> * testsuite/g++.dg/tree-prof/lipo/tls
Re: C++ PATCH for c++/65398 (valid constexpr rejected)
On 03/20/2015 11:11 AM, Jakub Jelinek wrote: &s[3] is the address of the terminal \0. Yeah, sure. But the above testcase does &s[4], which is out of bounds arithmetics But since &s[3] is the address of an element of the array (the NUL), &s[4] is one-past-the-end, which is fine. &s[5] would be out of bounds. I'm not saying it is absolutely necessary to handle this for GCC 5, but we at least should not treat the POINTER_PLUS_EXPR offset in the constexpr handling as unsigned, but think of it as signed, otherwise we reject even valid code - say constexpr char d = *(&s[3] - 1). Agreed. And I still think it makes sense to fix this as part of this patch. Jason
Re: C++ PATCH for c++/65398 (valid constexpr rejected)
On Fri, Mar 20, 2015 at 11:02:59AM -0400, Jason Merrill wrote: > On 03/20/2015 10:59 AM, Jakub Jelinek wrote: > >On Fri, Mar 20, 2015 at 10:53:50AM -0400, Jason Merrill wrote: > >>On 03/19/2015 02:05 PM, Jakub Jelinek wrote: > >>>Though, a question is if we do (or, if we don't and should) reject say > >>>constexpr char s[] = "abc"; > >>>constexpr int j = 4; > >>>constexpr char c = *(&s[j] - 2); > >>>because there was out of bound access in there. > >> > >>I don't see an out-of-bound access in this example; taking the address of > >>one-past-the-end is OK as long as you don't try to access through it. > > > >It is taking address of two past the end though - &s[3] is fine, sure. > >But &s[4] is invalid already. > > &s[3] is the address of the terminal \0. Yeah, sure. But the above testcase does &s[4], which is out of bounds arithmetics, and then subtracts 2 to point it back into range. I'm not saying it is absolutely necessary to handle this for GCC 5, but we at least should not treat the POINTER_PLUS_EXPR offset in the constexpr handling as unsigned, but think of it as signed, otherwise we reject even valid code - say constexpr char d = *(&s[3] - 1). Jakub
Re: C++ PATCH for c++/65398 (valid constexpr rejected)
On 03/20/2015 10:59 AM, Jakub Jelinek wrote: On Fri, Mar 20, 2015 at 10:53:50AM -0400, Jason Merrill wrote: On 03/19/2015 02:05 PM, Jakub Jelinek wrote: Though, a question is if we do (or, if we don't and should) reject say constexpr char s[] = "abc"; constexpr int j = 4; constexpr char c = *(&s[j] - 2); because there was out of bound access in there. I don't see an out-of-bound access in this example; taking the address of one-past-the-end is OK as long as you don't try to access through it. It is taking address of two past the end though - &s[3] is fine, sure. But &s[4] is invalid already. &s[3] is the address of the terminal \0. Jason
Re: C++ PATCH for c++/65398 (valid constexpr rejected)
On Fri, Mar 20, 2015 at 10:53:50AM -0400, Jason Merrill wrote: > On 03/19/2015 02:05 PM, Jakub Jelinek wrote: > >Though, a question is if we do (or, if we don't and should) reject say > >constexpr char s[] = "abc"; > >constexpr int j = 4; > >constexpr char c = *(&s[j] - 2); > >because there was out of bound access in there. > > I don't see an out-of-bound access in this example; taking the address of > one-past-the-end is OK as long as you don't try to access through it. It is taking address of two past the end though - &s[3] is fine, sure. But &s[4] is invalid already. > >Unfortunately we reject even that (regardless the patch), and yeah, it's > >because of how POINTER_PLUS_EXPR uses sizetype as a type of the second > >operand. > > This seems like something to fix in this patch. > > >+ tree t = fold_convert_loc (loc, TREE_TYPE (op01), > >+ TREE_OPERAND (op00, 1)); > >+ t = size_binop_loc (loc, PLUS_EXPR, op01, t); > > This seems to be assuming that the elements are size 1. Jakub
Re: C++ PATCH for c++/65398 (valid constexpr rejected)
On 03/19/2015 02:05 PM, Jakub Jelinek wrote: Though, a question is if we do (or, if we don't and should) reject say constexpr char s[] = "abc"; constexpr int j = 4; constexpr char c = *(&s[j] - 2); because there was out of bound access in there. I don't see an out-of-bound access in this example; taking the address of one-past-the-end is OK as long as you don't try to access through it. Unfortunately we reject even that (regardless the patch), and yeah, it's because of how POINTER_PLUS_EXPR uses sizetype as a type of the second operand. This seems like something to fix in this patch. + tree t = fold_convert_loc (loc, TREE_TYPE (op01), +TREE_OPERAND (op00, 1)); + t = size_binop_loc (loc, PLUS_EXPR, op01, t); This seems to be assuming that the elements are size 1. Jason
Re: [patch c++]: Fix for PR/65390
Hello, the problem here is that for cases of vla-array-types, the types don't get finally layouted in build_cplus_array_type. So the type-alignment isn't set in such cases for the resulting type. ChangeLog 2015-03-20 Kai Tietz PR c++/65390 * tree.c (strip_typedefs): Ignore alignment difference during processing template. 2015-03-20 Kai Tietz PR c++/65390 * g++.dg/template/pr65390.C: New file. Tested on x86_64-unknown-linux-gnu. Ok for apply? Regards, Kai Index: gcc/gcc/cp/tree.c === --- gcc.orig/gcc/cp/tree.c +++ gcc/gcc/cp/tree.c @@ -1356,7 +1356,8 @@ strip_typedefs (tree t) if (!result) result = TYPE_MAIN_VARIANT (t); if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result) - || TYPE_ALIGN (t) != TYPE_ALIGN (result)) + || (processing_template_decl + && TYPE_ALIGN (t) != TYPE_ALIGN (result))) { gcc_assert (TYPE_USER_ALIGN (t)); if (TYPE_ALIGN (t) == TYPE_ALIGN (result)) Index: gcc/gcc/testsuite/g++.dg/template/pr65390.C === --- /dev/null +++ gcc/gcc/testsuite/g++.dg/template/pr65390.C @@ -0,0 +1,12 @@ +// { dg-do compile } +// { dg-options "-Wno-vla" } +template struct shared_ptr { }; + +template +shared_ptr make_shared(Arg) { return shared_ptr(); } // { dg-message "note" } +// { dg-warning "ignoring attributes" "template" { target *-*-* } 6 } + +void f(int n){ + make_shared(1); // { dg-error "no matching" } +} +// { dg-error "variably modified type|trying to instantiate" "type" { target *-*-* } 10 }
Re: C++ PATCH for c++/65072 (ICE with anonymous aggregate and decltype)
On 03/20/2015 08:59 AM, Richard Biener wrote: Hum... variant types should have the same TYPE_FIELDs as the main variant. So why's the variants TYPE_FIELDS NULL? Because we're in the middle of parsing the class, so we haven't called fixup_type_variants to copy TYPE_FIELDS yet. The patch is OK. Jason
[Patch, Fortran, Backport to 4.9, pr60255] [OOP] Deferred character length variable at (1) cannot yet be associated with unlimited polymorphic entities
Hi all, I was now asked several times when this basic version for deferred length char arrays in unlimited polymorphic entities will be backported to 4.9. Please note, I was not asked *if* I backport it, but *when*. So here it is. The patch is essential the same like for 5.0, besides that some formating, indentation and line ending issues have been fixed. Furthermore, is the small patch from https://gcc.gnu.org/ml/fortran/2015-02/msg5.html concerning the typo in the length-size integrated. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Ok for 4.9-trunk? Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de pr60255-4.9.clog Description: Binary data diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index aee9666..a8212f7 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -34,6 +34,12 @@ along with GCC; see the file COPYING3. If not see (pointer/allocatable/dimension/...). * _vptr: A pointer to the vtable entry (see below) of the dynamic type. +Only for unlimited polymorphic classes: +* _len: An integer(4) to store the string length when the unlimited + polymorphic pointer is used to point to a char array. The '_len' + component will be zero when no character array is stored in + '_data'. + For each derived type we set up a "vtable" entry, i.e. a structure with the following fields: * _hash: A hash value serving as a unique identifier for this type. @@ -544,10 +550,48 @@ gfc_intrinsic_hash_value (gfc_typespec *ts) } +/* Get the _len component from a class/derived object storing a string. + For unlimited polymorphic entities a ref to the _data component is available + while a ref to the _len component is needed. This routine traverese the + ref-chain and strips the last ref to a _data from it replacing it with a + ref to the _len component. */ + +gfc_expr * +gfc_get_len_component (gfc_expr *e) +{ + gfc_expr *ptr; + gfc_ref *ref, **last; + + ptr = gfc_copy_expr (e); + + /* We need to remove the last _data component ref from ptr. */ + last = &(ptr->ref); + ref = ptr->ref; + while (ref) +{ + if (!ref->next + && ref->type == REF_COMPONENT + && strcmp ("_data", ref->u.c.component->name)== 0) + { + gfc_free_ref_list (ref); + *last = NULL; + break; + } + last = &(ref->next); + ref = ref->next; +} + /* And replace if with a ref to the _len component. */ + gfc_add_component_ref (ptr, "_len"); + return ptr; +} + + /* Build a polymorphic CLASS entity, using the symbol that comes from build_sym. A CLASS entity is represented by an encapsulating type, which contains the declared type as '_data' component, plus a pointer - component '_vptr' which determines the dynamic type. */ + component '_vptr' which determines the dynamic type. When this CLASS + entity is unlimited polymorphic, then also add a component '_len' to + store the length of string when that is stored in it. */ bool gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr, @@ -645,19 +689,28 @@ gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr, if (!gfc_add_component (fclass, "_vptr", &c)) return false; c->ts.type = BT_DERIVED; + c->attr.access = ACCESS_PRIVATE; + c->attr.pointer = 1; if (ts->u.derived->attr.unlimited_polymorphic) { vtab = gfc_find_derived_vtab (ts->u.derived); gcc_assert (vtab); c->ts.u.derived = vtab->ts.u.derived; + + /* Add component '_len'. Only unlimited polymorphic pointers may + have a string assigned to them, i.e., only those need the _len + component. */ + if (!gfc_add_component (fclass, "_len", &c)) + return false; + c->ts.type = BT_INTEGER; + c->ts.kind = 4; + c->attr.access = ACCESS_PRIVATE; + c->attr.artificial = 1; } else /* Build vtab later. */ c->ts.u.derived = NULL; - - c->attr.access = ACCESS_PRIVATE; - c->attr.pointer = 1; } if (!ts->u.derived->attr.unlimited_polymorphic) @@ -2434,18 +2487,9 @@ find_intrinsic_vtab (gfc_typespec *ts) gfc_symbol *copy = NULL, *src = NULL, *dst = NULL; int charlen = 0; - if (ts->type == BT_CHARACTER) -{ - if (ts->deferred) - { - gfc_error ("TODO: Deferred character length variable at %C cannot " - "yet be associated with unlimited polymorphic entities"); - return NULL; - } - else if (ts->u.cl && ts->u.cl->length - && ts->u.cl->length->expr_type == EXPR_CONSTANT) - charlen = mpz_get_si (ts->u.cl->length->value.integer); -} + if (ts->type == BT_CHARACTER && !ts->deferred && ts->u.cl && ts->u.cl->length + && ts->u.cl->length->expr_type == EXPR_CONSTANT) +charlen = mpz_get_si (ts->u.cl->length->value.integer); /* Find the top-level namespace. */ for (ns = gfc_current_ns; ns; ns = ns->parent) diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index a193f53..8cc
Re: [PATCH] Speed-up def_builtin_const (ix86_valid_target_attribute)
On 03/20/2015 12:24 PM, Jakub Jelinek wrote: On Fri, Mar 20, 2015 at 11:33:09AM +0100, Martin Liška wrote: @@ -30670,6 +30673,20 @@ def_builtin_const (HOST_WIDE_INT mask, const char *name, static void ix86_add_new_builtins (HOST_WIDE_INT isa) { + /* Last cached isa value. */ + static HOST_WIDE_INT last_tested_isa_value = 0; + + /* We iterate through all defined builtins just if the last tested + values is different from ISA and just in case there is any intersection + between ISA value and union of all possible configurations. + Last condition skips iterations if ISA is changed by the change has + empty intersection with defined_isa_values. */ + if ((isa & defined_isa_values) == 0 || isa == last_tested_isa_value + || ((isa ^ last_tested_isa_value) & defined_isa_values) == 0) +return; + + last_tested_isa_value = isa; Isn't at least the isa == last_tested_isa_value test useless? I mean, if they are equal, then (isa ^ last_tested_isa_value) is 0 and so the third test is true. Also, given that the loop does something only for (ix86_builtins_isa[i].isa & isa) != 0 it means once you call ix86_add_new_builtins with some particular bit set in the isa, it doesn't make sense to try that bit again. So, I think it would be better: 1) rename defined_isa_values bitmask to say deferred_isa_values (as in, isa bits that might still enable any new builtins) 2) in def_builtin, don't or in the mask unconditionally, but only in the else case - when add_builtin_function is not called 3) in ix86_add_new_builtins, don't add last_tested_isa_value at all, instead do: if ((isa & deferred_isa_values) == 0) return; deferred_isa_values &= ~isa; That way, when you actually enable all builtins (either immediately in def_builtin because it wasn't C/C++-like FE, or because all ISAs were enabled from the start, or because ix86_add_new_builtins has been already called with sufficiently full bitmask), deferred_isa_values will be 0 and you won't do anything in the function any more. Jakub Thank you Jakub for smart solution. I've just implemented new patch, which I've been testing. What do you think about it? Martin >From 3469163018d3c8ad2849bb9d241e12ae8723da90 Mon Sep 17 00:00:00 2001 From: mliska Date: Fri, 20 Mar 2015 14:51:47 +0100 Subject: [PATCH] Speed-up def_builtin_const (ix86_valid_target_attribute). gcc/ChangeLog: 2015-03-20 Martin Liska Jakub Jelinek * config/i386/i386.c (def_builtin): Set deferred_isa_values for masks that can potentially include a builtin. (ix86_add_new_builtins): Introduce fast filter for isa values that cannot trigger builtin inclusion. --- gcc/config/i386/i386.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 47deda7..82a4848 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -30588,6 +30588,8 @@ struct builtin_isa { static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX]; +/* Bits that can still enable any inclusion of a builtin. */ +static HOST_WIDE_INT deferred_isa_values = 0; /* Add an ix86 target builtin function with CODE, NAME and TYPE. Save the MASK of which isa_flags to use in the ix86_builtins_isa array. Stores the @@ -30631,6 +30633,9 @@ def_builtin (HOST_WIDE_INT mask, const char *name, } else { + /* Just a MASK where set_and_not_built_p == true can potentially + include a builtin. */ + deferred_isa_values |= mask; ix86_builtins[(int) code] = NULL_TREE; ix86_builtins_isa[(int) code].tcode = tcode; ix86_builtins_isa[(int) code].name = name; @@ -30666,6 +30671,12 @@ def_builtin_const (HOST_WIDE_INT mask, const char *name, static void ix86_add_new_builtins (HOST_WIDE_INT isa) { + if ((isa & deferred_isa_values) == 0) +return; + + /* Bits in ISA value can be removed from potential isa values. */ + deferred_isa_values &= ~isa; + int i; tree saved_current_target_pragma = current_target_pragma; current_target_pragma = NULL_TREE; -- 2.1.2
[Patch, Fortran, v1] Cosmetics and code simplify
Hi all, during checking that pr 61275 is really fixed, I found a indentation issue and a piece of my former code, that could be done nicer and more readable. This patch addresses both these issues. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Ok for trunk? Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de cosm_simp.clog Description: Binary data diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index a30c391..f6fe9a1 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -7111,7 +7111,7 @@ gfc_conv_structure (gfc_se * se, gfc_expr * expr, int init) of EXPR_NULL,... by default, the static nullify is not needed since this is done every time we come into scope. */ if (!c->expr || (cm->attr.allocatable && cm->attr.flavor != FL_PROCEDURE)) -continue; + continue; if (cm->initializer && cm->initializer->expr_type != EXPR_NULL && strcmp (cm->name, "_extends") == 0 @@ -7132,13 +7132,9 @@ gfc_conv_structure (gfc_se * se, gfc_expr * expr, int init) val)); } else if (cm->ts.type == BT_INTEGER && strcmp (cm->name, "_len") == 0) -{ - gfc_expr *e = gfc_get_int_expr (gfc_default_integer_kind, NULL, 0); - val = gfc_conv_constant_to_tree (e); - CONSTRUCTOR_APPEND_ELT (v, cm->backend_decl, - fold_convert (TREE_TYPE (cm->backend_decl), -val)); -} + CONSTRUCTOR_APPEND_ELT (v, cm->backend_decl, +fold_convert (TREE_TYPE (cm->backend_decl), + integer_zero_node)); else { val = gfc_conv_initializer (c->expr, &cm->ts,
Re: C++ PATCH for c++/65072 (ICE with anonymous aggregate and decltype)
On Fri, Mar 20, 2015 at 01:59:14PM +0100, Richard Biener wrote: > Hum... variant types should have the same TYPE_FIELDs as the main variant. > So why's the variants TYPE_FIELDS NULL? That is a good question, but unfortunately, even after hours trying to find that out, I don't know. I thought that when we're looking for a FIELD_DECL, we should use the cv-unqualified type. Marek
Re: [patch] c++/65046 Use abi-tag to fix building with -Werror=abi-tag
On 18/03/15 18:16 +, Jonathan Wakely wrote: This adds the abi_tag attribute in a few places it's missing, so that -Werror=abi-tag builds will work after Jason's upcoming patches to make functions inherit the tag from their return type. Jason suggested using an inline namespace instead of tagging types, so this adds __gnu_cxx::__cxx11, similar to the existing std::__cxx11. Tested x86_64-linux, committed to trunk. commit de4ceb0c3bc9cc833c044683c9d94ccb60812359 Author: Jonathan Wakely Date: Fri Mar 20 10:49:18 2015 + * include/bits/c++config (__gnu_cxx::__cxx11): Define new namespace. * include/ext/codecvt_specializations.h (encoding_state, encoding_char_traits): Remove abi-tag and use inline namespace. * testsuite/ext/profile/mutex_extensions_neg.cc: Adjust dg-error line. diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config index eebe34c..ae3065f 100644 --- a/libstdc++-v3/include/bits/c++config +++ b/libstdc++-v3/include/bits/c++config @@ -217,6 +217,10 @@ namespace std { inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { } } +namespace __gnu_cxx +{ + inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { } +} # define _GLIBCXX_NAMESPACE_CXX11 __cxx11:: # define _GLIBCXX_BEGIN_NAMESPACE_CXX11 namespace __cxx11 { # define _GLIBCXX_END_NAMESPACE_CXX11 } diff --git a/libstdc++-v3/include/ext/codecvt_specializations.h b/libstdc++-v3/include/ext/codecvt_specializations.h index d9f6630..34e90bd 100644 --- a/libstdc++-v3/include/ext/codecvt_specializations.h +++ b/libstdc++-v3/include/ext/codecvt_specializations.h @@ -41,13 +41,14 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default) { +_GLIBCXX_BEGIN_NAMESPACE_CXX11 _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Extension to use iconv for dealing with character encodings. // This includes conversions and comparisons between various character // sets. This object encapsulates data that may need to be shared between // char_traits, codecvt and ctype. - class _GLIBCXX_DEFAULT_ABI_TAG encoding_state + class encoding_state { public: // Types: @@ -207,7 +208,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // associated fpos for the position type, all other // bits equivalent to the required char_traits instantiations. template -struct _GLIBCXX_DEFAULT_ABI_TAG encoding_char_traits +struct encoding_char_traits : public std::char_traits<_CharT> { typedef encoding_statestate_type; @@ -215,6 +216,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION }; _GLIBCXX_END_NAMESPACE_VERSION +_GLIBCXX_END_NAMESPACE_CXX11 } // namespace diff --git a/libstdc++-v3/testsuite/ext/profile/mutex_extensions_neg.cc b/libstdc++-v3/testsuite/ext/profile/mutex_extensions_neg.cc index d7a0c56..dd19f14 100644 --- a/libstdc++-v3/testsuite/ext/profile/mutex_extensions_neg.cc +++ b/libstdc++-v3/testsuite/ext/profile/mutex_extensions_neg.cc @@ -25,4 +25,4 @@ #include -// { dg-error "multiple inlined namespaces" "" { target *-*-* } 318 } +// { dg-error "multiple inlined namespaces" "" { target *-*-* } 322 }
Re: C++ PATCH for c++/65072 (ICE with anonymous aggregate and decltype)
On Fri, Mar 20, 2015 at 1:48 PM, Marek Polacek wrote: > The problem turned out to be that lookup_anon_field was not able to find > a FIELD_DECL so we passed NULL member to finish_class_member_access_expr > and that SEGVs on that. The reason is that "const struct C" object does > not have any fields; for that, we need to look at the main variant. So > fixed by doing just that, lookup_anon_field is then able to find the decl > and we don't ICE anymore. > > Bootstrapped/regtested on x86_64-linux, ok for trunk? Strictly speaking, > it's not a regression, but the fix doesn't look terribly risky either. Hum... variant types should have the same TYPE_FIELDs as the main variant. So why's the variants TYPE_FIELDS NULL? Richard. > 2015-03-20 Marek Polacek > > PR c++/65072 > * typeck.c (lookup_anon_field): Make sure we're dealing with the main > variant. > > * g++.dg/cpp0x/pr65072.C: New test. > > diff --git gcc/cp/typeck.c gcc/cp/typeck.c > index 4c128b7..e9d4cae 100644 > --- gcc/cp/typeck.c > +++ gcc/cp/typeck.c > @@ -2213,6 +2213,8 @@ lookup_anon_field (tree t, tree type) > { >tree field; > > + t = TYPE_MAIN_VARIANT (t); > + >for (field = TYPE_FIELDS (t); field; field = DECL_CHAIN (field)) > { >if (TREE_STATIC (field)) > diff --git gcc/testsuite/g++.dg/cpp0x/pr65072.C > gcc/testsuite/g++.dg/cpp0x/pr65072.C > index e69de29..b8fa888 100644 > --- gcc/testsuite/g++.dg/cpp0x/pr65072.C > +++ gcc/testsuite/g++.dg/cpp0x/pr65072.C > @@ -0,0 +1,14 @@ > +// PR c++/65075 > +// { dg-do compile { target c++11 } } > +// { dg-options "-Wno-pedantic" } > + > +template class C > +{ > + struct > + { > +int i; > + }; > + auto operator*(const C m) -> decltype (m.i); > +}; > +void fn1 (const C); > +C a; > > Marek
C++ PATCH for c++/65072 (ICE with anonymous aggregate and decltype)
The problem turned out to be that lookup_anon_field was not able to find a FIELD_DECL so we passed NULL member to finish_class_member_access_expr and that SEGVs on that. The reason is that "const struct C" object does not have any fields; for that, we need to look at the main variant. So fixed by doing just that, lookup_anon_field is then able to find the decl and we don't ICE anymore. Bootstrapped/regtested on x86_64-linux, ok for trunk? Strictly speaking, it's not a regression, but the fix doesn't look terribly risky either. 2015-03-20 Marek Polacek PR c++/65072 * typeck.c (lookup_anon_field): Make sure we're dealing with the main variant. * g++.dg/cpp0x/pr65072.C: New test. diff --git gcc/cp/typeck.c gcc/cp/typeck.c index 4c128b7..e9d4cae 100644 --- gcc/cp/typeck.c +++ gcc/cp/typeck.c @@ -2213,6 +2213,8 @@ lookup_anon_field (tree t, tree type) { tree field; + t = TYPE_MAIN_VARIANT (t); + for (field = TYPE_FIELDS (t); field; field = DECL_CHAIN (field)) { if (TREE_STATIC (field)) diff --git gcc/testsuite/g++.dg/cpp0x/pr65072.C gcc/testsuite/g++.dg/cpp0x/pr65072.C index e69de29..b8fa888 100644 --- gcc/testsuite/g++.dg/cpp0x/pr65072.C +++ gcc/testsuite/g++.dg/cpp0x/pr65072.C @@ -0,0 +1,14 @@ +// PR c++/65075 +// { dg-do compile { target c++11 } } +// { dg-options "-Wno-pedantic" } + +template class C +{ + struct + { +int i; + }; + auto operator*(const C m) -> decltype (m.i); +}; +void fn1 (const C); +C a; Marek
Re: [PATCH][2/3][PR65458] Mark omp thread functions as parallelized
On Fri, Mar 20, 2015 at 01:30:48PM +0100, Tom de Vries wrote: > On 20-03-15 12:57, Jakub Jelinek wrote: > >>@@ -5569,6 +5569,7 @@ expand_omp_taskreg (struct omp_region *region) > >>>/* Inform the callgraph about the new function. */ > >>>DECL_STRUCT_FUNCTION (child_fn)->curr_properties = > >>> cfun->curr_properties; > >>>cgraph_node::add_new_function (child_fn, true); > >>>+ cgraph_node::get (child_fn)->parallelized_function = 1; > >IMHO you want to do this in create_omp_child_function instead, > > That way, the patch would no longer work for parloops. The current location > of the fix is triggered by both parloops, and the omp-in-source processing. > > >that way you > >handle it not just for the parallel region, but also e.g. for the task copy > >functions etc. > > I propose to handle task copy like this: > ... > diff --git a/gcc/omp-low.c b/gcc/omp-low.c > index 9be39b7..f8b5f85 100644 > --- a/gcc/omp-low.c > +++ b/gcc/omp-low.c > @@ -1564,6 +1564,7 @@ finalize_task_copyfn (gomp_task *task_stmt) > >/* Inform the callgraph about the new function. */ >cgraph_node::add_new_function (child_fn, false); > + cgraph_node::get (child_fn)->parallelized_function = 1; > } > > /* Destroy a omp_context data structures. Called through the splay tree > ... > > OK if retesting succeeds? Ok. Jakub
Re: [PATCH][2/3][PR65458] Mark omp thread functions as parallelized
On 20-03-15 12:57, Jakub Jelinek wrote: @@ -5569,6 +5569,7 @@ expand_omp_taskreg (struct omp_region *region) >/* Inform the callgraph about the new function. */ >DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties; >cgraph_node::add_new_function (child_fn, true); >+ cgraph_node::get (child_fn)->parallelized_function = 1; IMHO you want to do this in create_omp_child_function instead, That way, the patch would no longer work for parloops. The current location of the fix is triggered by both parloops, and the omp-in-source processing. that way you handle it not just for the parallel region, but also e.g. for the task copy functions etc. I propose to handle task copy like this: ... diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 9be39b7..f8b5f85 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -1564,6 +1564,7 @@ finalize_task_copyfn (gomp_task *task_stmt) /* Inform the callgraph about the new function. */ cgraph_node::add_new_function (child_fn, false); + cgraph_node::get (child_fn)->parallelized_function = 1; } /* Destroy a omp_context data structures. Called through the splay tree ... OK if retesting succeeds? Thanks, - Tom
Re: [PATCH] Enable fixing PR64715
On Fri, Mar 20, 2015 at 12:38:51PM +0100, Richard Biener wrote: > > This removes a road-block towards fixing PR64715 - the gimplifier > performing an undesired "simplification" of &X + CST to > &MEM[&X, CST]. IMHO this is premature there. > > The fallout is small - two testcases are no longer simplified at -O0 > (who cares), and devirt-40.C shows that SCEV adheres too closely > to the fold view of type compatibility (I had a fix for that in > my dev tree for quite some time it seeems). Of course the latter > means that there might be more fallout in code that builds > GENERIC and then re-gimplifies that, but I can't think of anything > that wouldn't be fixed with followup passes. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > Ok? LGTM. > 2015-03-20 Richard Biener > > PR middle-end/64715 > * tree-chrec.c (chrec_fold_poly_cst): Use useless_type_conversion_p > for type comparison and gcc_checking_assert. > (chrec_fold_plus_poly_poly): Likewise. > (chrec_fold_multiply_poly_poly): Likewise. > (chrec_convert_1): Likewise. > * gimplify.c (gimplify_expr): Remove premature folding of > &X + CST to &MEM[&X, CST]. > > * gcc.dg/pr15347.c: Use -O. > * c-c++-common/pr19807-1.c: Likewise. Jakub
GCC 5 Status Report (2015-03-20)
Status == The trunk is open for regression and documentation fixes only. We've come a long way towards the release criteria of zero P1 bugs. There are still a few remaining P1s though and we are targeting for a GCC 5 release candidate in the first week of April (given those P1 bugs are either fixed or demoted to P2). Please make sure your non-primary/secondary ports build and look for the low-hanging fruits in cleaning up your ports testsuite results. Please also prioritize fixing P1s and avoid pushing in risky fixes for P2s that might end up causing regressions. We are still seeing way too many changes in the IPA area (hi Honza!). Backporting a trunk fix for GCC 5.2 is a more reasonable thing for this kind of fixes at this point. Quality Data Priority # Change from last report --- --- P17- 25 P2 70- 27 P39- 23 P4 80 P5 36 --- --- Total P1-P3 86- 75 Total 202 Previous Report === https://gcc.gnu.org/ml/gcc/2015-01/msg00156.html
Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
On Fri, Mar 20, 2015 at 11:27 AM, Thomas Preud'homme wrote: > Sorry, I missed the parenthesis. REG_P needs indeed to be kept. I'd be > tempted to use !HARD_REGISTER_P instead since REG_P is already > checked but I don't mind either way. I put the cprop_reg_p check there instead of !HARD_REGISTER_P because I like to be able to quickly find all places where a similar check is performed. The check is whether the reg is something that copy propagation can handle, and that is what I added cprop_reg_p for. (Note that cprop can _currently_ handle only pseudos but there is no reason why a limited set of hard regs can't be handled also, e.g. the flag registers like in targetm.fixed_condition_code_regs). In this case, the result is that REG_P is checked twice. But then again, cprop_reg_p will be inlined and the double check optimized away. Anyway, I guess we've bikeshedded long enough over this patch as it is :-) Let's post a final form and declare it OK for stage1. As for PSEUDO_REG_P: If it were up to me, I'd like to have in rtl.h: static bool hard_register_p (rtx x) { return (REG_P (x) && HARD_REGISTER_NUM_P (REGNO (x))); } static bool pseudo_register_p (rtx x) { return (REG_P (x) && !HARD_REGISTER_NUM_P (REGNO (x))); } and do away with all the FIRST_PSEUDO_REGISTER tests. But I've proposed this in the past and there was opposition. Perhaps when we introduce a rtx_reg class... Ciao! Steven
Re: [PATCH][2/3][PR65458] Mark omp thread functions as parallelized
On Fri, Mar 20, 2015 at 12:37:11PM +0100, Tom de Vries wrote: > Mark omp thread functions as parallelized > > 2015-03-20 Tom de Vries > > PR tree-optimization/65458 > * cgraph.c (cgraph_node::dump): Handle parallelized_function field. > * cgraph.h (cgraph_node): Add parallelized_function field. > * lto-cgraph.c (lto_output_node): Write parallelized_function field. > (input_overwrite_node): Read parallelized_function field. > * omp-low.c (expand_omp_taskreg): Set parallelized_function on > cgraph_node for child_fn. > * tree-parloops.c: Add include of plugin-api.h, ipa-ref.h and cgraph.h. > Remove include of gt-tree-parloops.h. > (parallelized_functions): Remove static variable. > (parallelized_function_p): Rewrite using parallelized_function field of > cgraph_node. > (create_loop_fn): Remove adding to parallelized_functions. You should also remove tree-parloops.c from GTFILES in gcc/Makefile.in. > --- > gcc/cgraph.c| 2 ++ > gcc/cgraph.h| 2 ++ > gcc/lto-cgraph.c| 2 ++ > gcc/omp-low.c | 1 + > gcc/tree-parloops.c | 27 --- > 5 files changed, 15 insertions(+), 19 deletions(-) > > diff --git a/gcc/cgraph.c b/gcc/cgraph.c > index ede58bf..4573057 100644 > --- a/gcc/cgraph.c > +++ b/gcc/cgraph.c > @@ -2009,6 +2009,8 @@ cgraph_node::dump (FILE *f) > fprintf (f, " only_called_at_exit"); >if (opt_for_fn (decl, optimize_size)) > fprintf (f, " optimize_size"); > + if (parallelized_function) > +fprintf (f, " parallelized_function"); > >fprintf (f, "\n"); > > diff --git a/gcc/cgraph.h b/gcc/cgraph.h > index 52b15c5..650e689 100644 > --- a/gcc/cgraph.h > +++ b/gcc/cgraph.h > @@ -1317,6 +1317,8 @@ public: >unsigned nonfreeing_fn : 1; >/* True if there was multiple COMDAT bodies merged by lto-symtab. */ >unsigned merged : 1; > + /* True if function was created to be executed in parallel. */ > + unsigned parallelized_function : 1; > > private: >/* Worker for call_for_symbol_and_aliases. */ > diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c > index c875fed..088de86 100644 > --- a/gcc/lto-cgraph.c > +++ b/gcc/lto-cgraph.c > @@ -574,6 +574,7 @@ lto_output_node (struct lto_simple_output_block *ob, > struct cgraph_node *node, >bp_pack_value (&bp, node->icf_merged, 1); >bp_pack_value (&bp, node->nonfreeing_fn, 1); >bp_pack_value (&bp, node->thunk.thunk_p, 1); > + bp_pack_value (&bp, node->parallelized_function, 1); >bp_pack_enum (&bp, ld_plugin_symbol_resolution, > LDPR_NUM_KNOWN, node->resolution); >bp_pack_value (&bp, node->instrumentation_clone, 1); > @@ -1209,6 +1210,7 @@ input_overwrite_node (struct lto_file_decl_data > *file_data, >node->icf_merged = bp_unpack_value (bp, 1); >node->nonfreeing_fn = bp_unpack_value (bp, 1); >node->thunk.thunk_p = bp_unpack_value (bp, 1); > + node->parallelized_function = bp_unpack_value (bp, 1); >node->resolution = bp_unpack_enum (bp, ld_plugin_symbol_resolution, >LDPR_NUM_KNOWN); >node->instrumentation_clone = bp_unpack_value (bp, 1); > diff --git a/gcc/omp-low.c b/gcc/omp-low.c > index 48d73cb..5ca9e84 100644 > --- a/gcc/omp-low.c > +++ b/gcc/omp-low.c > @@ -5569,6 +5569,7 @@ expand_omp_taskreg (struct omp_region *region) >/* Inform the callgraph about the new function. */ >DECL_STRUCT_FUNCTION (child_fn)->curr_properties = > cfun->curr_properties; >cgraph_node::add_new_function (child_fn, true); > + cgraph_node::get (child_fn)->parallelized_function = 1; IMHO you want to do this in create_omp_child_function instead, that way you handle it not just for the parallel region, but also e.g. for the task copy functions etc. Ok with those changes. Jakub
Re: [PATCH][3/3][PR65460] Mark offloaded functions as parallelized
On 19-03-15 12:05, Tom de Vries wrote: On 18-03-15 18:22, Tom de Vries wrote: Hi, this patch fixes PR65460. The patch marks offloaded functions as parallelized, which means the parloops pass no longer attempts to modify that function. Updated patch to postpone mark_parallelized_function until the corresponding cgraph_node is available, to ensure it works with the updated mark_parallelized_function from patch 2/3. Updated to eliminate mark_parallelized_function. Bootstrapped and reg-tested on x86_64. OK for stage4? Thanks, - Tom Mark offloaded functions as parallelized 2015-03-20 Tom de Vries PR tree-optimization/65460 * omp-low.c (expand_omp_target): Set parallelized_function on cgraph_node for child_fn. --- gcc/omp-low.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 5ca9e84..9be39b7 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -8937,6 +8937,7 @@ expand_omp_target (struct omp_region *region) /* Inform the callgraph about the new function. */ DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties; cgraph_node::add_new_function (child_fn, true); + cgraph_node::get (child_fn)->parallelized_function = 1; #ifdef ENABLE_OFFLOADING /* Add the new function to the offload table. */ -- 1.9.1
[PATCH] Enable fixing PR64715
This removes a road-block towards fixing PR64715 - the gimplifier performing an undesired "simplification" of &X + CST to &MEM[&X, CST]. IMHO this is premature there. The fallout is small - two testcases are no longer simplified at -O0 (who cares), and devirt-40.C shows that SCEV adheres too closely to the fold view of type compatibility (I had a fix for that in my dev tree for quite some time it seeems). Of course the latter means that there might be more fallout in code that builds GENERIC and then re-gimplifies that, but I can't think of anything that wouldn't be fixed with followup passes. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Ok? Thanks, Richard. 2015-03-20 Richard Biener PR middle-end/64715 * tree-chrec.c (chrec_fold_poly_cst): Use useless_type_conversion_p for type comparison and gcc_checking_assert. (chrec_fold_plus_poly_poly): Likewise. (chrec_fold_multiply_poly_poly): Likewise. (chrec_convert_1): Likewise. * gimplify.c (gimplify_expr): Remove premature folding of &X + CST to &MEM[&X, CST]. * gcc.dg/pr15347.c: Use -O. * c-c++-common/pr19807-1.c: Likewise. Index: gcc/tree-chrec.c === --- gcc/tree-chrec.c(revision 221515) +++ gcc/tree-chrec.c(working copy) @@ -78,8 +78,8 @@ chrec_fold_poly_cst (enum tree_code code gcc_assert (poly); gcc_assert (cst); gcc_assert (TREE_CODE (poly) == POLYNOMIAL_CHREC); - gcc_assert (!is_not_constant_evolution (cst)); - gcc_assert (type == chrec_type (poly)); + gcc_checking_assert (!is_not_constant_evolution (cst)); + gcc_checking_assert (useless_type_conversion_p (type, chrec_type (poly))); switch (code) { @@ -124,10 +124,11 @@ chrec_fold_plus_poly_poly (enum tree_cod gcc_assert (TREE_CODE (poly0) == POLYNOMIAL_CHREC); gcc_assert (TREE_CODE (poly1) == POLYNOMIAL_CHREC); if (POINTER_TYPE_P (chrec_type (poly0))) -gcc_assert (ptrofftype_p (chrec_type (poly1))); +gcc_checking_assert (ptrofftype_p (chrec_type (poly1)) +&& useless_type_conversion_p (type, chrec_type (poly0))); else -gcc_assert (chrec_type (poly0) == chrec_type (poly1)); - gcc_assert (type == chrec_type (poly0)); +gcc_checking_assert (useless_type_conversion_p (type, chrec_type (poly0)) +&& useless_type_conversion_p (type, chrec_type (poly1))); /* {a, +, b}_1 + {c, +, d}_2 -> {{a, +, b}_1 + c, +, d}_2, @@ -208,8 +209,8 @@ chrec_fold_multiply_poly_poly (tree type gcc_assert (poly1); gcc_assert (TREE_CODE (poly0) == POLYNOMIAL_CHREC); gcc_assert (TREE_CODE (poly1) == POLYNOMIAL_CHREC); - gcc_assert (chrec_type (poly0) == chrec_type (poly1)); - gcc_assert (type == chrec_type (poly0)); + gcc_checking_assert (useless_type_conversion_p (type, chrec_type (poly0)) + && useless_type_conversion_p (type, chrec_type (poly1))); /* {a, +, b}_1 * {c, +, d}_2 -> {c*{a, +, b}_1, +, d}_2, {a, +, b}_2 * {c, +, d}_1 -> {a*{c, +, d}_1, +, b}_2, @@ -1352,7 +1353,7 @@ chrec_convert_1 (tree type, tree chrec, return chrec; ct = chrec_type (chrec); - if (ct == type) + if (useless_type_conversion_p (type, ct)) return chrec; if (!evolution_function_is_affine_p (chrec)) Index: gcc/testsuite/gcc.dg/pr15347.c === --- gcc/testsuite/gcc.dg/pr15347.c (revision 221530) +++ gcc/testsuite/gcc.dg/pr15347.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do link } */ +/* { dg-options "-O" } */ extern void link_error (void); int Index: gcc/testsuite/c-c++-common/pr19807-1.c === --- gcc/testsuite/c-c++-common/pr19807-1.c (revision 221530) +++ gcc/testsuite/c-c++-common/pr19807-1.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do link } */ +/* { dg-options "-O" } */ extern void link_error(void); int main() Index: gcc/gimplify.c === *** gcc/gimplify.c (revision 221514) --- gcc/gimplify.c (working copy) *** gimplify_expr (tree *expr_p, gimple_seq *** 8524,8546 post_p, is_gimple_val, fb_rvalue); recalculate_side_effects (*expr_p); ret = MIN (r0, r1); - /* Convert &X + CST to invariant &MEM[&X, CST]. Do this - after gimplifying operands - this is similar to how - it would be folding all gimplified stmts on creation - to have them canonicalized, which is what we eventually - should do anyway. */ - if (TREE_CODE (TREE_OPERAND (*expr_p, 1)) == INTEGER_CST - && is_gimple_min_invariant (TREE_OPERAND (*expr_p, 0))) - { - *expr_p = build_fold_addr_expr_with_type_loc - (input_locat
Re: Make messages_members.cc Catalog_info and Catalogs ABI agnostic
On 19/03/15 21:50 +0100, François Dumont wrote: On 18/03/2015 19:16, Jonathan Wakely wrote: Preparing this patch reminded me that we currently have two copies of the Catalog_info and Catalogs code in the unnamed namespace in config/locale/gnu/messages_members.cc, one using the old string and one using the new. We should really alter the code to not use std::string so that the catalogs can be shared by both versions of the messages facets. Hello Do you mean like the attached patch ? Or do I need to isolate get_catalogs function in a dedicated source file that won't be built twice ? That's a partial fix, but the types and functions are still in an anonymous namespace, so will still be duplicated. To do it correctly they can't be in an anon namespace and the names must be uglified.
Re: [PATCH][2/3][PR65458] Mark omp thread functions as parallelized
On 19-03-15 12:29, Jakub Jelinek wrote: On Thu, Mar 19, 2015 at 12:27:04PM +0100, Tom de Vries wrote: Sure, I can update that, I'll retest and repost. Yes, please. Probably the tree-parloops.h include will not be needed either then. Updated to eliminate mark_parallelized_function. Bootstrapped and reg-tested on x86_64. OK for stage4? Thanks, - Tom Mark omp thread functions as parallelized 2015-03-20 Tom de Vries PR tree-optimization/65458 * cgraph.c (cgraph_node::dump): Handle parallelized_function field. * cgraph.h (cgraph_node): Add parallelized_function field. * lto-cgraph.c (lto_output_node): Write parallelized_function field. (input_overwrite_node): Read parallelized_function field. * omp-low.c (expand_omp_taskreg): Set parallelized_function on cgraph_node for child_fn. * tree-parloops.c: Add include of plugin-api.h, ipa-ref.h and cgraph.h. Remove include of gt-tree-parloops.h. (parallelized_functions): Remove static variable. (parallelized_function_p): Rewrite using parallelized_function field of cgraph_node. (create_loop_fn): Remove adding to parallelized_functions. --- gcc/cgraph.c| 2 ++ gcc/cgraph.h| 2 ++ gcc/lto-cgraph.c| 2 ++ gcc/omp-low.c | 1 + gcc/tree-parloops.c | 27 --- 5 files changed, 15 insertions(+), 19 deletions(-) diff --git a/gcc/cgraph.c b/gcc/cgraph.c index ede58bf..4573057 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -2009,6 +2009,8 @@ cgraph_node::dump (FILE *f) fprintf (f, " only_called_at_exit"); if (opt_for_fn (decl, optimize_size)) fprintf (f, " optimize_size"); + if (parallelized_function) +fprintf (f, " parallelized_function"); fprintf (f, "\n"); diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 52b15c5..650e689 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -1317,6 +1317,8 @@ public: unsigned nonfreeing_fn : 1; /* True if there was multiple COMDAT bodies merged by lto-symtab. */ unsigned merged : 1; + /* True if function was created to be executed in parallel. */ + unsigned parallelized_function : 1; private: /* Worker for call_for_symbol_and_aliases. */ diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index c875fed..088de86 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -574,6 +574,7 @@ lto_output_node (struct lto_simple_output_block *ob, struct cgraph_node *node, bp_pack_value (&bp, node->icf_merged, 1); bp_pack_value (&bp, node->nonfreeing_fn, 1); bp_pack_value (&bp, node->thunk.thunk_p, 1); + bp_pack_value (&bp, node->parallelized_function, 1); bp_pack_enum (&bp, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN, node->resolution); bp_pack_value (&bp, node->instrumentation_clone, 1); @@ -1209,6 +1210,7 @@ input_overwrite_node (struct lto_file_decl_data *file_data, node->icf_merged = bp_unpack_value (bp, 1); node->nonfreeing_fn = bp_unpack_value (bp, 1); node->thunk.thunk_p = bp_unpack_value (bp, 1); + node->parallelized_function = bp_unpack_value (bp, 1); node->resolution = bp_unpack_enum (bp, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); node->instrumentation_clone = bp_unpack_value (bp, 1); diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 48d73cb..5ca9e84 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -5569,6 +5569,7 @@ expand_omp_taskreg (struct omp_region *region) /* Inform the callgraph about the new function. */ DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties; cgraph_node::add_new_function (child_fn, true); + cgraph_node::get (child_fn)->parallelized_function = 1; /* Fix the callgraph edges for child_cfun. Those for cfun will be fixed in a following pass. */ diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index a584460..62a6444 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -75,6 +75,9 @@ along with GCC; see the file COPYING3. If not see #include "tree-parloops.h" #include "omp-low.h" #include "tree-nested.h" +#include "plugin-api.h" +#include "ipa-ref.h" +#include "cgraph.h" /* This pass tries to distribute iterations of loops into several threads. The implementation is straightforward -- for each loop we test whether its @@ -1422,21 +1425,14 @@ separate_decls_in_region (edge entry, edge exit, } } -/* Bitmap containing uids of functions created by parallelization. We cannot - allocate it from the default obstack, as it must live across compilation - of several functions; we make it gc allocated instead. */ - -static GTY(()) bitmap parallelized_functions; - -/* Returns true if FN was created by create_loop_fn. */ +/* Returns true if FN was created to run in parallel. */ bool -parallelized_function_p (tree fn) +parallelized_function_p (tree fndecl) { - if (!parallelized_functions || !DECL_ARTIFICIAL (fn)) -return false; - - return bitmap_bit_p (parallelized_functions, DECL_UID (fn)); + cgraph_node *node = cgraph_node::get (fndecl); + gcc_assert (n
Re: [PATCH] Speed-up def_builtin_const (ix86_valid_target_attribute)
On Fri, Mar 20, 2015 at 11:33:09AM +0100, Martin Liška wrote: > @@ -30670,6 +30673,20 @@ def_builtin_const (HOST_WIDE_INT mask, const char > *name, > static void > ix86_add_new_builtins (HOST_WIDE_INT isa) > { > + /* Last cached isa value. */ > + static HOST_WIDE_INT last_tested_isa_value = 0; > + > + /* We iterate through all defined builtins just if the last tested > + values is different from ISA and just in case there is any intersection > + between ISA value and union of all possible configurations. > + Last condition skips iterations if ISA is changed by the change has > + empty intersection with defined_isa_values. */ > + if ((isa & defined_isa_values) == 0 || isa == last_tested_isa_value > + || ((isa ^ last_tested_isa_value) & defined_isa_values) == 0) > +return; > + > + last_tested_isa_value = isa; Isn't at least the isa == last_tested_isa_value test useless? I mean, if they are equal, then (isa ^ last_tested_isa_value) is 0 and so the third test is true. Also, given that the loop does something only for (ix86_builtins_isa[i].isa & isa) != 0 it means once you call ix86_add_new_builtins with some particular bit set in the isa, it doesn't make sense to try that bit again. So, I think it would be better: 1) rename defined_isa_values bitmask to say deferred_isa_values (as in, isa bits that might still enable any new builtins) 2) in def_builtin, don't or in the mask unconditionally, but only in the else case - when add_builtin_function is not called 3) in ix86_add_new_builtins, don't add last_tested_isa_value at all, instead do: if ((isa & deferred_isa_values) == 0) return; deferred_isa_values &= ~isa; That way, when you actually enable all builtins (either immediately in def_builtin because it wasn't C/C++-like FE, or because all ISAs were enabled from the start, or because ix86_add_new_builtins has been already called with sufficiently full bitmask), deferred_isa_values will be 0 and you won't do anything in the function any more. Jakub
Re: [PATCH] Speed-up def_builtin_const (ix86_valid_target_attribute)
On 03/20/2015 01:10 AM, Jan Hubicka wrote: Hi. Explanation of the patch is introduced. thanks mask &= ~OPTION_MASK_ISA_64BIT; if (mask == 0 @@ -30670,6 +30673,14 @@ def_builtin_const (HOST_WIDE_INT mask, const char *name, static void ix86_add_new_builtins (HOST_WIDE_INT isa) { + /* Last cached isa value. */ + static HOST_WIDE_INT last_tested_isa_value = 0; + + if ((isa & defined_isa_values) == 0 || isa == last_tested_isa_value) Heer you need to compare (isa & defined_isa_values) == (isa & last_tested_isa_value) right, because we have isa flags that enable no builtins. I do not understand why, the guard simply ignores last value, which is already processed and 'isa' with any intersection with defined_isa_values. Maybe I miss something? I think you can also skip processing in cases ISA changed but the change has empty intersection with defined_isa_values. Honza Good idea. I've integrated this part to upgraded patch, which can survive regression tests on x86_64-linux-pc and can bootstrap. Ready for trunk? Thanks, Martin >From 5df909a7d06e91c9506a7d7186015b431f9db04a Mon Sep 17 00:00:00 2001 From: marxin Date: Sun, 8 Mar 2015 19:39:55 -0500 Subject: [PATCH] def_builtin_const: speed-up. gcc/ChangeLog: 2015-03-09 Martin Liska * config/i386/i386.c (def_builtin): Collect union of all possible masks. (ix86_add_new_builtins): Do not iterate over all builtins in cases that isa value has no intersection with possible masks and(or) last passed value is equal to the provided. --- gcc/config/i386/i386.c | 17 + 1 file changed, 17 insertions(+) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ab8f03a..afb2fa4 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -30592,6 +30592,8 @@ struct builtin_isa { static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX]; +/* Union of all masks that are part of builtin_isa structures. */ +static HOST_WIDE_INT defined_isa_values = 0; /* Add an ix86 target builtin function with CODE, NAME and TYPE. Save the MASK of which isa_flags to use in the ix86_builtins_isa array. Stores the @@ -30619,6 +30621,7 @@ def_builtin (HOST_WIDE_INT mask, const char *name, if (!(mask & OPTION_MASK_ISA_64BIT) || TARGET_64BIT) { ix86_builtins_isa[(int) code].isa = mask; + defined_isa_values |= mask; mask &= ~OPTION_MASK_ISA_64BIT; if (mask == 0 @@ -30670,6 +30673,20 @@ def_builtin_const (HOST_WIDE_INT mask, const char *name, static void ix86_add_new_builtins (HOST_WIDE_INT isa) { + /* Last cached isa value. */ + static HOST_WIDE_INT last_tested_isa_value = 0; + + /* We iterate through all defined builtins just if the last tested + values is different from ISA and just in case there is any intersection + between ISA value and union of all possible configurations. + Last condition skips iterations if ISA is changed by the change has + empty intersection with defined_isa_values. */ + if ((isa & defined_isa_values) == 0 || isa == last_tested_isa_value + || ((isa ^ last_tested_isa_value) & defined_isa_values) == 0) +return; + + last_tested_isa_value = isa; + int i; tree saved_current_target_pragma = current_target_pragma; current_target_pragma = NULL_TREE; -- 2.1.2
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > I noticed in do_local_cprop you replace >= FIRST_PSEUDO_REGISTER by > cprop_reg_p without removing the REG_P as well. Sorry, I missed the parenthesis. REG_P needs indeed to be kept. I'd be tempted to use !HARD_REGISTER_P instead since REG_P is already checked but I don't mind either way. Best regards, Thomas
Fix PR ada/65451
This is a regression present on the mainline: gigi gets confused translating a volatile access type whose designated type has its elaboration delayed. Fixed thusly, tested on x86_64-suse-linux, applied on the mainline. 2015-03-20 Eric Botcazou PR ada/65451 * gcc-interface/utils.c (gnat_pushdecl): Tidy up and improve comment. Make sure to chain only main variants through TYPE_NEXT_PTR_TO. 2015-03-20 Eric Botcazou * gnat.dg/volatile13.ad[sb]: New test. -- Eric BotcazouIndex: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 221457) +++ gcc-interface/utils.c (working copy) @@ -787,19 +787,17 @@ gnat_pushdecl (tree decl, Node_Id gnat_n { /* Array types aren't "tagged" types so we force the type to be associated with its typedef in the DWARF back-end, in order to - make sure that the latter is always preserved. We used to do the - same for pointer types, but to have consistent DWARF output we now - create copies for DECL_ORIGINAL_TYPE just like the C front-end - does in c-common.c:set_underlying_type. */ + make sure that the latter is always preserved, by creating an + on-side copy for DECL_ORIGINAL_TYPE. We used to do the same + for pointer types, but to have consistent DWARF output we now + create a copy for the type itself and use the original type + for DECL_ORIGINAL_TYPE like the C front-end. */ if (!DECL_ARTIFICIAL (decl) && TREE_CODE (t) == ARRAY_TYPE) { tree tt = build_distinct_type_copy (t); - if (TREE_CODE (t) == POINTER_TYPE) - TYPE_NEXT_PTR_TO (t) = tt; - /* Array types need to have a name so that they can be related to - their GNAT encodings. */ - if (TREE_CODE (t) == ARRAY_TYPE) - TYPE_NAME (tt) = DECL_NAME (decl); + /* Array types need to have a name so that they can be related + to their GNAT encodings. */ + TYPE_NAME (tt) = DECL_NAME (decl); defer_or_set_type_context (tt, DECL_CONTEXT (decl), deferred_decl_context); @@ -811,13 +809,17 @@ gnat_pushdecl (tree decl, Node_Id gnat_n && (TREE_CODE (t) == POINTER_TYPE || TYPE_IS_FAT_POINTER_P (t))) { tree tt; - /* ??? We need a variant for the placeholder machinery to work. */ + /* ??? Copy and original type are not supposed to be variant but we + really need a variant for the placeholder machinery to work. */ if (TYPE_IS_FAT_POINTER_P (t)) tt = build_variant_type_copy (t); else - tt = build_distinct_type_copy (t); - if (TREE_CODE (t) == POINTER_TYPE) - TYPE_NEXT_PTR_TO (t) = tt; + { + /* TYPE_NEXT_PTR_TO is a chain of main variants. */ + tt = build_distinct_type_copy (TYPE_MAIN_VARIANT (t)); + TYPE_NEXT_PTR_TO (TYPE_MAIN_VARIANT (t)) = tt; + tt = build_qualified_type (tt, TYPE_QUALS (t)); + } TYPE_NAME (tt) = decl; defer_or_set_type_context (tt, DECL_CONTEXT (decl),-- { dg-do compile } package body Volatile13 is procedure Compute_Index_Map (Self : Shared_String) is Map : Index_Map_Access := Self.Index_Map; begin Map := new Index_Map (Self.Length); end; end Volatile13;package Volatile13 is type Index_Map (Length : Natural) is record Map : String (1 .. Length); end record; type Index_Map_Access is access all Index_Map; pragma Volatile (Index_Map_Access); type Shared_String (Size : Natural) is limited record Length: Natural := 0; Index_Map : Index_Map_Access := null; end record; Shared_Empty : Shared_String := (Size => 64, others => <>); procedure Compute_Index_Map (Self : Shared_String); end Volatile13;
Re: [PATCH] Fix warnings from including fdl.texi into gnat-style.texi
On 22-02-15 18:37, Arnaud Charlet wrote: While generating gnat-style.info, we see these warnings: ... src/gcc/doc/include/fdl.texi:33: warning: node `Index' is next for `GNU Free Documentation License' in menu but not in sectioning src/gcc/doc/include/fdl.texi:33: warning: node `Top' is up for `GNU Free Documentation License' in menu but not in sectioning ... Attached patch fixes these. Ok for stage4? The gnat-style.texi part is OK. I cannot approve the fdl part though. Gerald, Can you approve the fdl part? Thanks, - Tom
Re: [PATCH] Run DCE after if conversion
On Fri, Mar 20, 2015 at 9:50 AM, Andreas Krebbel wrote: > On 03/18/2015 12:04 PM, Richard Biener wrote: >> On Tue, Mar 17, 2015 at 7:29 PM, Jeff Law wrote: >>> On 03/17/2015 02:17 AM, Andreas Krebbel wrote: Just to have some numbers I did run a -j1 GCC bootstrap twice with and without the patch on x86_64. Best results for both are: clean: 21459s patched: 21314s There rather appears to be a trend towards reduced compile time perhaps due to the reduced number of INSNs to be processed in the RTL passes between the two ifcvt runs (loop optimization, combine, fwprop, dse,...)?! I also tried to measure the testsuite runs but the results show a big variance. So what I have right now does not qualify as a benchmark. >>> >>> And reality is it's getting harder and harder to benchmark this kind of >>> thing with turbo modes and such.A single run isn't sufficient unless >>> you've locked the box into a particular cpu frequency. >> >> For the particular patch I wonder if you really need to change all >> three if-conversion pass instances or if changing the one before >> combine (pass_rtl_ifcvt, thus rest_of_handle_if_conversion) is enough. > Right. For this particular case it would be good enough to do it only in the > first ifcvt run. But > perhaps there are cases where later passes get confused by the leftovers from > ifcvt? Well, "perhaps" ... >> That already runs an unconditonal (huh...) cleanup_cfg (0) at the end >> which could be changed so that DCE is performed (CLEANUP_EXPENSIVE, >> runs delete_trivially_dead_insns). >> >> At least that makes the patch smaller and its impact restricted to >> one of the three ifcvt passes. > This does not seem to work. The DCE run by cleanup_cfg only deals with dead > pseudos. In my case it > is a set to the CC hard reg which becomes dead. I see. >> OTOH ifcvt performs a DCE at its start (to be not confused by dead >> instructions I guess), so why doesn't combine do that as well >> (oh, it does!?)? > When reaching combine the LR solution is clean and therefore also DCE isn't > performed. This is > because ifcvt disables running DCE with LR. So LR is always clean after ifcvt > although there are > still dead insns left. Ok... >> And maybe _that_ DCE can be removed as if_convert () >> already performs a DF_LR_RUN_DCE on the first pass. > You mean removing the DCE run in combine? That one probably can go away then > given that the passes > running between ifcvt and combine (loop and fwprop) get rid of dead insns > properly. So I think the best solution is still removing the dead code manually in ifcvt. Then there is also pass_ud_rtl_dce run before the combine pass - if that isn't enough because of clean solutions or whatever then that should be strenghthened - it was most definitely designed to fix the issue you see. Richard. > -Andreas- > >> >> Richard. >> >>> jeff >>> >> >
Re: [PATCH] Fix fdump-passes
On Thu, Mar 19, 2015 at 11:40 PM, Tom de Vries wrote: > [ was: Re: [PATCH, stage1] Make parloops gate more strict ] > On 19-03-15 10:00, Richard Biener wrote: Yeah - it makes the -fdump-passes "hack" more pervasive throughout >>the compiler. >> >>I suppose it should instead build & push a "dummy" sturct function. >> >>> >>> > >>> >Like this? >> >> Looks good to me. > > > > Added ChangeLog entry, bootstrapped and reg-tested on x86_64. > > OK for stage1 (or stage4) trunk? Ok for stage1. Thanks, Richard. > Thanks, > - Tom
Re: Fix PR 65177: diamonds are not valid execution threads for jump threading
On Thu, Mar 19, 2015 at 8:54 PM, Sebastian Pop wrote: > Richard Biener wrote: >> please instead fixup after copy_bbs in duplicate_seme_region. >> > > Thanks for the review. > Attached patch that does not modify copy_bbs. > Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp > > Full bootstrap and regtest in progress on x86_64-linux. Ok for trunk? Looks good to me. Please also add a testcase. Thanks, Richard.
Re: [PATCH] Fix thunk expansion (PR ipa/64896)
Ping. On 11 March 2015 at 16:38, Yvan Roux wrote: > Hi, > > PR ipa/65236 * cgraphunit.c (cgraph_node::expand_thunk): Enable return slot opt. >> >> This bugfix adds ipa-icf-6.C test which failed on 4.9 branch as >> ipa-icf is not backported on that branch. Is the bugfix still >> relevant and we can dropped the testcase ? >> PR ipa/64813 * cgraphunit.c (cgraph_node::expand_thunk): Do not create a return value for call to a function that is noreturn. PR ipa/63595 * cgraphunit.c (cgraph_node::expand_thunk): DECL_BY_REFERENCE is correctly handled for thunks created by IPA ICF. PR ipa/63587 * cgraphunit.c (cgraph_node::expand_thunk): Only VAR_DECLs are put to local declarations. * function.c (add_local_decl): Implementation moved from header file, assert introduced for tree type. * function.h: Likewise. > > Here is the two patches that backport PR ipa/63587 and PR ipa/64813 > fixes in 4.9 branch. The 2 others introduce test cases that check > ipa-icf pass dumps, so I'm not sure if the code has to be backported. > > bootstrapped/regtested on x86_64 and cross-compiled/regtested on > aarch64-linux-gnu > arm-linux-gnueabihf > armeb-linux-gnueabihf > i686-linux-gnu > > Ok for 4.9 ? > > Thanks > Yvan > > - PR 63587 - > gcc/ > 2015-03-11 Yvan Roux > > Backport from trunk r216841. > 2014-10-29 Martin Liska > > PR ipa/63587 > * cgraphunit.c (cgraph_node::expand_thunk): Only VAR_DECLs are put > to local declarations. > * function.c (add_local_decl): Implementation moved from header > file, assert introduced for tree type. > * function.h: Likewise. > > gcc/testsuite/ > 2015-03-11 Yvan Roux > > Backport from trunk r216841. > 2014-10-29 Martin Liska > > PR ipa/63587 > * g++.dg/ipa/pr63587-1.C: New test. > * g++.dg/ipa/pr63587-2.C: New test. > > - PR 64813 - > 2015-03-11 Yvan Roux > > Backport from trunk r220616. > 2015-02-11 Martin Liska > > PR ipa/64813 > * cgraphunit.c (cgraph_node::expand_thunk): Do not create > a return value for call to a function that is noreturn.
Re: [PATCH] Run DCE after if conversion
On 03/18/2015 12:04 PM, Richard Biener wrote: > On Tue, Mar 17, 2015 at 7:29 PM, Jeff Law wrote: >> On 03/17/2015 02:17 AM, Andreas Krebbel wrote: >>> >>> >>> Just to have some numbers I did run a -j1 GCC bootstrap twice with and >>> without the patch on x86_64. >>> Best results for both are: >>> >>> clean: 21459s >>> patched: 21314s >>> >>> There rather appears to be a trend towards reduced compile time perhaps >>> due to the reduced number of >>> INSNs to be processed in the RTL passes between the two ifcvt runs (loop >>> optimization, combine, >>> fwprop, dse,...)?! >>> >>> I also tried to measure the testsuite runs but the results show a big >>> variance. So what I have right >>> now does not qualify as a benchmark. >> >> And reality is it's getting harder and harder to benchmark this kind of >> thing with turbo modes and such.A single run isn't sufficient unless >> you've locked the box into a particular cpu frequency. > > For the particular patch I wonder if you really need to change all > three if-conversion pass instances or if changing the one before > combine (pass_rtl_ifcvt, thus rest_of_handle_if_conversion) is enough. Right. For this particular case it would be good enough to do it only in the first ifcvt run. But perhaps there are cases where later passes get confused by the leftovers from ifcvt? > That already runs an unconditonal (huh...) cleanup_cfg (0) at the end > which could be changed so that DCE is performed (CLEANUP_EXPENSIVE, > runs delete_trivially_dead_insns). > > At least that makes the patch smaller and its impact restricted to > one of the three ifcvt passes. This does not seem to work. The DCE run by cleanup_cfg only deals with dead pseudos. In my case it is a set to the CC hard reg which becomes dead. > OTOH ifcvt performs a DCE at its start (to be not confused by dead > instructions I guess), so why doesn't combine do that as well > (oh, it does!?)? When reaching combine the LR solution is clean and therefore also DCE isn't performed. This is because ifcvt disables running DCE with LR. So LR is always clean after ifcvt although there are still dead insns left. > And maybe _that_ DCE can be removed as if_convert () > already performs a DF_LR_RUN_DCE on the first pass. You mean removing the DCE run in combine? That one probably can go away then given that the passes running between ifcvt and combine (loop and fwprop) get rid of dead insns properly. -Andreas- > > Richard. > >> jeff >>> >>> >> >
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
Hi Steven, > From: Steven Bosscher [mailto:stevenb@gmail.com] > Sent: Friday, March 20, 2015 3:54 PM > > > What I meant, is that I believe the tests are already done in > hash_scan_set and should be redundant in cprop_insn (i.e. the test can > be replaced with gcc_[checking_]assert). Ok. > > I've attached a patch with some changes to it: introduce cprop_reg_p() > to get rid of all the "REG_P && regno > FIRST_PSEUDO_REGISTER" tests. > I still have the cprop_constant_p and cprop_reg_p tests in cprop_insn > but this weekend I'll try with gcc_checking_asserts instead. Please > have a look at the patch and let me know if you like it (given it's > mostly yours I hope you do like it ;-) I think it would be preferable to introduce PSEUDO_REG_P in rtl.h as this seems like a common pattern enough [1]. It would be nice to have a HARD_REG_P that would be cover the other common patterns REG_P && < FIRST_PSEUDO_REGISTER and REG_P && HARD_REGISTER_P but I can't come up with a good name (HARD_REGISTER_P is confusing because it doesn't check if it's a register in the first place). I noticed in do_local_cprop you replace >= FIRST_PSEUDO_REGISTER by cprop_reg_p without removing the REG_P as well. In implicit_set_cond_p there is a replacement of !REG_P || HARD_REGISTER_P by cprop_reg_p. It seems to me it should rather be replaced by !cprop_reg_p. Otherwise it looks ok. [1] grep -R "REG_P .*&&.*>= FIRST_PSEUDO_REGISTER" . | wc -l returns 23 > > Bootstrapped & tested on powerpc64-unknown-linux-gnu. In building all > of cc1, 35 extra copies are propagated with the patch. I'll try to launch a build and testsuite run with these 2 issues fixed before I leave tonight and will report the result on Monday. Best regards, Thomas
[PATCH, CHKP] Restore transparent alias chains
Hi, Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit. We always expect it for instrumentation clones, thus restore it input_cgraph_1. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-20 Ilya Enkovich * lto-cgraph.c (input_cgraph_1): Always link instrumented assembler name with original one. diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index c875fed..d782327 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -1613,9 +1613,8 @@ input_cgraph_1 (struct lto_file_decl_data *file_data, } /* Restore decl names reference. */ - if (IDENTIFIER_TRANSPARENT_ALIAS (DECL_ASSEMBLER_NAME (cnode->decl)) - && !TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode->decl))) - TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode->decl)) + IDENTIFIER_TRANSPARENT_ALIAS (DECL_ASSEMBLER_NAME (cnode->decl)) = 1; + TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode->decl)) = DECL_ASSEMBLER_NAME (cnode->orig_decl); } }
Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
On Tue, Feb 17, 2015 at 3:51 AM, Thomas Preud'homme wrote: >> > - else if (REG_P (src) >> > - && REGNO (src) >= FIRST_PSEUDO_REGISTER >> > - && REGNO (src) != regno) >> > - { >> > - if (try_replace_reg (reg_used, src, insn)) >> > + else if (src_reg && REG_P (src_reg) >> > + && REGNO (src_reg) >= FIRST_PSEUDO_REGISTER >> > + && REGNO (src_reg) != regno >> > + && try_replace_reg (reg_used, src_reg, insn)) >> >> Likewise for the REG_P and ">= FIRST_PSEUDO_REGISTER" tests here >> (with >> the equivalent and IMHO preferable HARD_REGISTER_P test in >> find_avail_set()). > > I'm not sure I follow you here. First, it seems to me that the equivalent > test is rather REG_P && !HARD_REGISTER_P since here it checks if it's > a pseudo register. > > Then, do you mean the test can be simply removed because of the > REG_P && !HARD_REGISTER_P in hash_scan_set () called indirectly by > compute_hash_table () when called in one_cprop_pass () before any > cprop_insn ()? Or do you mean I should move the check in > find_avail_set ()? What I meant, is that I believe the tests are already done in hash_scan_set and should be redundant in cprop_insn (i.e. the test can be replaced with gcc_[checking_]assert). I've attached a patch with some changes to it: introduce cprop_reg_p() to get rid of all the "REG_P && regno > FIRST_PSEUDO_REGISTER" tests. I still have the cprop_constant_p and cprop_reg_p tests in cprop_insn but this weekend I'll try with gcc_checking_asserts instead. Please have a look at the patch and let me know if you like it (given it's mostly yours I hope you do like it ;-) Bootstrapped & tested on powerpc64-unknown-linux-gnu. In building all of cc1, 35 extra copies are propagated with the patch. Ciao! Steven Index: cprop.c === --- cprop.c (revision 221523) +++ cprop.c (working copy) @@ -285,6 +285,15 @@ cprop_constant_p (const_rtx x) return CONSTANT_P (x) && (GET_CODE (x) != CONST || shared_const_p (x)); } +/* Determine whether the rtx X should be treated as a register that can + be propagated. Any pseudo-register is fine. */ + +static bool +cprop_reg_p (const_rtx x) +{ + return REG_P (x) && !HARD_REGISTER_P (x); +} + /* Scan SET present in INSN and add an entry to the hash TABLE. IMPLICIT is true if it's an implicit set, false otherwise. */ @@ -295,8 +304,7 @@ hash_scan_set (rtx set, rtx_insn *insn, struct has rtx src = SET_SRC (set); rtx dest = SET_DEST (set); - if (REG_P (dest) - && ! HARD_REGISTER_P (dest) + if (cprop_reg_p (dest) && reg_available_p (dest, insn) && can_copy_p (GET_MODE (dest))) { @@ -321,9 +329,8 @@ hash_scan_set (rtx set, rtx_insn *insn, struct has src = XEXP (note, 0), set = gen_rtx_SET (VOIDmode, dest, src); /* Record sets for constant/copy propagation. */ - if ((REG_P (src) + if ((cprop_reg_p (src) && src != dest - && ! HARD_REGISTER_P (src) && reg_available_p (src, insn)) || cprop_constant_p (src)) insert_set_in_table (dest, src, insn, table, implicit); @@ -821,15 +828,15 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn) return success; } -/* Find a set of REGNOs that are available on entry to INSN's block. Return - NULL no such set is found. */ +/* Find a set of REGNOs that are available on entry to INSN's block. If found, + SET_RET[0] will be assigned a set with a register source and SET_RET[1] a + set with a constant source. If not found the corresponding entry is set to + NULL. */ -static struct cprop_expr * -find_avail_set (int regno, rtx_insn *insn) +static void +find_avail_set (int regno, rtx_insn *insn, struct cprop_expr *set_ret[2]) { - /* SET1 contains the last set found that can be returned to the caller for - use in a substitution. */ - struct cprop_expr *set1 = 0; + set_ret[0] = set_ret[1] = NULL; /* Loops are not possible here. To get a loop we would need two sets available at the start of the block containing INSN. i.e. we would @@ -869,8 +876,10 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn) If the source operand changed, we may still use it for the next iteration of this loop, but we may not use it for substitutions. */ - if (cprop_constant_p (src) || reg_not_set_p (src, insn)) - set1 = set; + if (cprop_constant_p (src)) + set_ret[1] = set; + else if (reg_not_set_p (src, insn)) + set_ret[0] = set; /* If the source of the set is anything except a register, then we have reached the end of the copy chain. */ @@ -881,10 +890,6 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn) and see if we have an available copy into SRC. */ regno = REGNO (src); } - - /* SET1 holds the last set that was available and anticipatable at -