Re: New modref/ipa_modref optimization passes
On Thu, 29 Oct 2020, Jan Hubicka wrote: > > Hi, > > this is patch I am using to fix the assumed_alias_type.f90 failure by > > simply arranging alias set 0 for the problematic array descriptor. > > > > I am not sure this is the best option, but I suppose it is better than > > setting all array descritors to have same canonical type (as done by > > LTO)? > > > Hi, > here is updated patch which used TYPELESS_STORAGE instead of alias set > 0, so it is LTO safe. Unforunately I also had to enable it for all > array descriptors otherwise I still get misopitmizations with modref > extended to handle bulitins, for example: > > FAIL: gfortran.dg/class_array_20.f03 -Os execution test > FAIL: gfortran.dg/coindexed_1.f90 -O2 execution test > FAIL: gfortran.dg/coindexed_1.f90 -O3 -fomit-frame-pointer > FAIL: gfortran.dg/coindexed_1.f90 -O3 -g execution test > > This is not a perfect solution (we really want to track array > descriptors), but it fixes wrong code and would let me to move forward. > Is it OK for mainline? > > With extended modref I still get infinite loop on pdt_14 testcase. > ipa-modref only performs disambiguation on > __vtab_link_module_Pdtlink_8._deallocate this global variable is > readonly (and is detected as such with LTO) so it must be just > uncovering some latent problem there. I am however not familiar enough > with Fortran to tell what is wrong there. > > The testcase fail different way with -flto for me. > > Bootstrapped/regtested x86_64-linux, OK? OK. Thanks, Richard. > Honza > > * trans-types.c: Include alias.h > (gfc_get_array_type_bounds): Set typeless storage. > diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c > index b15ea667411..b7129dcbe6d 100644 > --- a/gcc/fortran/trans-types.c > +++ b/gcc/fortran/trans-types.c > @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3. If not see > #include "trans-array.h" > #include "dwarf2out.h" /* For struct array_descr_info. */ > #include "attribs.h" > +#include "alias.h" > > > #if (GFC_MAX_DIMENSIONS < 10) > @@ -1903,6 +1904,10 @@ gfc_get_array_type_bounds (tree etype, int dimen, int > codimen, tree * lbound, >base_type = gfc_get_array_descriptor_base (dimen, codimen, false); >TYPE_CANONICAL (fat_type) = base_type; >TYPE_STUB_DECL (fat_type) = TYPE_STUB_DECL (base_type); > + /* Arrays of unknown type must alias with all array descriptors. */ > + TYPE_TYPELESS_STORAGE (base_type) = 1; > + TYPE_TYPELESS_STORAGE (fat_type) = 1; > + gcc_checking_assert (!get_alias_set (base_type) && !get_alias_set > (fat_type)); > >tmp = TYPE_NAME (etype); >if (tmp && TREE_CODE (tmp) == TYPE_DECL) >
Re: New modref/ipa_modref optimization passes
> Hi, > this is patch I am using to fix the assumed_alias_type.f90 failure by > simply arranging alias set 0 for the problematic array descriptor. > > I am not sure this is the best option, but I suppose it is better than > setting all array descritors to have same canonical type (as done by > LTO)? > Hi, here is updated patch which used TYPELESS_STORAGE instead of alias set 0, so it is LTO safe. Unforunately I also had to enable it for all array descriptors otherwise I still get misopitmizations with modref extended to handle bulitins, for example: FAIL: gfortran.dg/class_array_20.f03 -Os execution test FAIL: gfortran.dg/coindexed_1.f90 -O2 execution test FAIL: gfortran.dg/coindexed_1.f90 -O3 -fomit-frame-pointer FAIL: gfortran.dg/coindexed_1.f90 -O3 -g execution test This is not a perfect solution (we really want to track array descriptors), but it fixes wrong code and would let me to move forward. Is it OK for mainline? With extended modref I still get infinite loop on pdt_14 testcase. ipa-modref only performs disambiguation on __vtab_link_module_Pdtlink_8._deallocate this global variable is readonly (and is detected as such with LTO) so it must be just uncovering some latent problem there. I am however not familiar enough with Fortran to tell what is wrong there. The testcase fail different way with -flto for me. Bootstrapped/regtested x86_64-linux, OK? Honza * trans-types.c: Include alias.h (gfc_get_array_type_bounds): Set typeless storage. diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c index b15ea667411..b7129dcbe6d 100644 --- a/gcc/fortran/trans-types.c +++ b/gcc/fortran/trans-types.c @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3. If not see #include "trans-array.h" #include "dwarf2out.h" /* For struct array_descr_info. */ #include "attribs.h" +#include "alias.h" #if (GFC_MAX_DIMENSIONS < 10) @@ -1903,6 +1904,10 @@ gfc_get_array_type_bounds (tree etype, int dimen, int codimen, tree * lbound, base_type = gfc_get_array_descriptor_base (dimen, codimen, false); TYPE_CANONICAL (fat_type) = base_type; TYPE_STUB_DECL (fat_type) = TYPE_STUB_DECL (base_type); + /* Arrays of unknown type must alias with all array descriptors. */ + TYPE_TYPELESS_STORAGE (base_type) = 1; + TYPE_TYPELESS_STORAGE (fat_type) = 1; + gcc_checking_assert (!get_alias_set (base_type) && !get_alias_set (fat_type)); tmp = TYPE_NAME (etype); if (tmp && TREE_CODE (tmp) == TYPE_DECL)
Re: New modref/ipa_modref optimization passes
On Fri, 23 Oct 2020 11:54:08 +0200 Bernhard Reutner-Fischer via Fortran wrote: > On 16 October 2020 11:20:23 CEST, Richard Biener wrote: > > >IMHO the cleanest way would be to swap the CAF token field and > >the dim[] field (which is an ABI change for -fcoarray) > > I think coarrays are new anyway so I suppose an ABI break is fine? Coarrays are in the standard since Fortran 2008. So what I'd rather not call them new being there for more than 10 years... The descriptor is used in the opencoarrays library, too. And has to be kept in sync. So when the ABI break is reasonable it's fine. -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: New modref/ipa_modref optimization passes
On 16 October 2020 11:20:23 CEST, Richard Biener wrote: >IMHO the cleanest way would be to swap the CAF token field and >the dim[] field (which is an ABI change for -fcoarray) I think coarrays are new anyway so I suppose an ABI break is fine?
Re: New modref/ipa_modref optimization passes
On Fri, 16 Oct 2020, Richard Biener wrote: > On Fri, 16 Oct 2020, Richard Biener wrote: > > > On Fri, 16 Oct 2020, Jan Hubicka wrote: > > > > > Hi, > > > I am slowly getting finished with the fn spec changes on trunk and then > > > would like to proceed with modref. Sadly still get the assumed_type > > > failuere and in addition to that: > > > FAIL: gfortran.dg/finalize_25.f90 -O2 execution test > > > FAIL: gfortran.dg/finalize_25.f90 -O3 -fomit-frame-pointer > > > -funroll-loops -fpeel-loops -ftracer -finline-functions execution test > > > FAIL: gfortran.dg/finalize_25.f90 -O3 -g execution test > > > FAIL: gfortran.dg/finalize_25.f90 -Os execution test > > > WARNING: gfortran.dg/pdt_14.f03 -O2 execution test program timed out. > > > FAIL: gfortran.dg/pdt_14.f03 -O2 execution test > > > WARNING: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops > > > -fpeel-loops -ftracer -finline-functions execution test > > > program timed out. > > > FAIL: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops > > > -fpeel-loops -ftracer -finline-functions execution test > > > WARNING: gfortran.dg/pdt_14.f03 -O3 -g execution test program timed > > > out. > > > FAIL: gfortran.dg/pdt_14.f03 -O3 -g execution test > > > WARNING: gfortran.dg/pdt_14.f03 -Os execution test program timed out. > > > FAIL: gfortran.dg/pdt_14.f03 -Os execution test > > > > > > I wonder if there is any chance to get Fortran FE fixed here? > > > > OK, I'll try doing a little surgery in the FE today, coming up with > > a little refactoring and a fix along your original one that allows > > for a better one by FE folks. > > So I've sent a refactoring patch improving the tree building code. > > But now trying to fix the actual issue with the idea to perform > accesses indirectly via a descriptor with dim[] type I see that > the coarray 'token' field is appended to descriptors conditional > on -fcoarray and that this field makes the dim[] array no longer > trailing - which means the offset of the 'token' field depends > on the rank of the array. > > The dim[] field is even optional when dim + codimen == 0 and that > case indeed happens (ah, via get_scalar_to_descriptor_type). > So much for re-using this combo ;) > > I suppose we can compensate for this by dynamically computing the > offset of the 'token' field but then it's not obvious to me > where the total 'rank' is stored inside the descriptor or how > the 'token' field is accessed for assumed-shape arrays - the > current method by simple field chaining definitely won't work. > > Anyway, I'll try to deal with all this by just adjusting the TBAA > type but not the access type leaving that alone. > > IMHO the cleanest way would be to swap the CAF token field and > the dim[] field (which is an ABI change for -fcoarray) OK, so I tried diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index f30a2f75701..29381f6756e 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -142,6 +142,17 @@ gfc_get_descriptor_field (tree desc, unsigned field_idx) tree field = gfc_advance_chain (TYPE_FIELDS (type), field_idx); gcc_assert (field != NULL_TREE); + /* We need to use a consistent descriptor type across all accesses + which should be possible by indirectly accessing the descriptor + via a type with a trailing flexible array dim[] member if there + were not the CAF token field after it. So for now ensure correct + TBAA behavior by explicitely specifying a common TBAA type - any + descriptor-like type is OK here. */ + tree tbaa_type += build_pointer_type (gfc_get_array_descriptor_base (4, 2, false)); + desc = fold_build2_loc (input_location, MEM_REF, type, + build_fold_addr_expr_loc (input_location, desc), + build_int_cst (tbaa_type, 0)); return fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field), desc, field, NULL_TREE); } which first reveals two spots missed by the sent refactoring and second exposes the fact that we use aggregate copies to copy array descritors or aggregates with array descriptor typed fields. There's currently no way to assign a custom TBAA type to aggregate fields which means that the only choice is to fix things at the above central place is using alias-set zero (and hoping for the embedded/aggregate copied places to match up). In the end this means that the optimal approach is to adjust only those accesses where we do not know the actual descriptor type but I expect those to be spread out? Eventually those cases could be identified above? Meanwhile the refactoring patch still applies of course, amended by adjustments to gfc_conv_descriptor_data_addr and gfc_conv_descriptor_data_set. Unfortunately punning the descriptor like above causes numerous testsuite FAILs due to orignal dump scannings no longer matching :/ So I'm hoping for a hint as to how to
Re: New modref/ipa_modref optimization passes
On Fri, 16 Oct 2020, Richard Biener wrote: > On Fri, 16 Oct 2020, Jan Hubicka wrote: > > > Hi, > > I am slowly getting finished with the fn spec changes on trunk and then > > would like to proceed with modref. Sadly still get the assumed_type > > failuere and in addition to that: > > FAIL: gfortran.dg/finalize_25.f90 -O2 execution test > > FAIL: gfortran.dg/finalize_25.f90 -O3 -fomit-frame-pointer -funroll-loops > > -fpeel-loops -ftracer -finline-functions execution test > > FAIL: gfortran.dg/finalize_25.f90 -O3 -g execution test > > FAIL: gfortran.dg/finalize_25.f90 -Os execution test > > WARNING: gfortran.dg/pdt_14.f03 -O2 execution test program timed out. > > FAIL: gfortran.dg/pdt_14.f03 -O2 execution test > > WARNING: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops > > -fpeel-loops -ftracer -finline-functions execution test > > program timed out. > > FAIL: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops > > -fpeel-loops -ftracer -finline-functions execution test > > WARNING: gfortran.dg/pdt_14.f03 -O3 -g execution test program timed out. > > FAIL: gfortran.dg/pdt_14.f03 -O3 -g execution test > > WARNING: gfortran.dg/pdt_14.f03 -Os execution test program timed out. > > FAIL: gfortran.dg/pdt_14.f03 -Os execution test > > > > I wonder if there is any chance to get Fortran FE fixed here? > > OK, I'll try doing a little surgery in the FE today, coming up with > a little refactoring and a fix along your original one that allows > for a better one by FE folks. So I've sent a refactoring patch improving the tree building code. But now trying to fix the actual issue with the idea to perform accesses indirectly via a descriptor with dim[] type I see that the coarray 'token' field is appended to descriptors conditional on -fcoarray and that this field makes the dim[] array no longer trailing - which means the offset of the 'token' field depends on the rank of the array. The dim[] field is even optional when dim + codimen == 0 and that case indeed happens (ah, via get_scalar_to_descriptor_type). So much for re-using this combo ;) I suppose we can compensate for this by dynamically computing the offset of the 'token' field but then it's not obvious to me where the total 'rank' is stored inside the descriptor or how the 'token' field is accessed for assumed-shape arrays - the current method by simple field chaining definitely won't work. Anyway, I'll try to deal with all this by just adjusting the TBAA type but not the access type leaving that alone. IMHO the cleanest way would be to swap the CAF token field and the dim[] field (which is an ABI change for -fcoarray) Richard.
Re: New modref/ipa_modref optimization passes
On Fri, 16 Oct 2020, Jan Hubicka wrote: > Hi, > I am slowly getting finished with the fn spec changes on trunk and then > would like to proceed with modref. Sadly still get the assumed_type > failuere and in addition to that: > FAIL: gfortran.dg/finalize_25.f90 -O2 execution test > FAIL: gfortran.dg/finalize_25.f90 -O3 -fomit-frame-pointer -funroll-loops > -fpeel-loops -ftracer -finline-functions execution test > FAIL: gfortran.dg/finalize_25.f90 -O3 -g execution test > FAIL: gfortran.dg/finalize_25.f90 -Os execution test > WARNING: gfortran.dg/pdt_14.f03 -O2 execution test program timed out. > FAIL: gfortran.dg/pdt_14.f03 -O2 execution test > WARNING: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops > -fpeel-loops -ftracer -finline-functions execution test > program timed out. > FAIL: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops > -fpeel-loops -ftracer -finline-functions execution test > WARNING: gfortran.dg/pdt_14.f03 -O3 -g execution test program timed out. > FAIL: gfortran.dg/pdt_14.f03 -O3 -g execution test > WARNING: gfortran.dg/pdt_14.f03 -Os execution test program timed out. > FAIL: gfortran.dg/pdt_14.f03 -Os execution test > > I wonder if there is any chance to get Fortran FE fixed here? OK, I'll try doing a little surgery in the FE today, coming up with a little refactoring and a fix along your original one that allows for a better one by FE folks. Richard.
Re: New modref/ipa_modref optimization passes
Hi, I am slowly getting finished with the fn spec changes on trunk and then would like to proceed with modref. Sadly still get the assumed_type failuere and in addition to that: FAIL: gfortran.dg/finalize_25.f90 -O2 execution test FAIL: gfortran.dg/finalize_25.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/finalize_25.f90 -O3 -g execution test FAIL: gfortran.dg/finalize_25.f90 -Os execution test WARNING: gfortran.dg/pdt_14.f03 -O2 execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -O2 execution test WARNING: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test WARNING: gfortran.dg/pdt_14.f03 -O3 -g execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -O3 -g execution test WARNING: gfortran.dg/pdt_14.f03 -Os execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -Os execution test I wonder if there is any chance to get Fortran FE fixed here? Honza
Re: New modref/ipa_modref optimization passes
> On 9/21/20 10:10 AM, Richard Biener wrote: > > > > I see, so you would expect call to alsize to initialize things in > > > array15_unkonwn type? That would work too. > > Yes, that's my expectation. But let's see what fortran folks say. > > RFC patch attached; I think the following should work, but I am not > sure whether I missed something. > > I wonder what to do about > '!GCC$ NO_ARG_CHECK :: x > but that seems to work fine (creates void* type) and as it only > permits assumed size or scalar variables, the descriptor issue > does not occur. > > Thoughts? Hi, with somewhat improved ipa-modref and your patch i get following failures: FAIL: gfortran.dg/assumed_type_2.f90 -O scan-tree-dump-times original "sub_array_assumed ((struct t1.0:. .) parm" 1 FAIL: gfortran.dg/assumed_type_9.f90 -O2 execution test FAIL: gfortran.dg/assumed_type_9.f90 -Os execution test FAIL: gfortran.dg/class_allocate_20.f90 -O2 execution test FAIL: gfortran.dg/class_allocate_20.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/class_allocate_20.f90 -O3 -g execution test FAIL: gfortran.dg/class_allocate_20.f90 -Os execution test FAIL: gfortran.dg/finalize_25.f90 -O2 execution test FAIL: gfortran.dg/finalize_25.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/finalize_25.f90 -O3 -g execution test FAIL: gfortran.dg/finalize_25.f90 -Os execution test FAIL: gfortran.dg/no_arg_check_2.f90 -O scan-tree-dump-times original "sub_array_assumed ((struct t1.0:. .) parm" 1 WARNING: gfortran.dg/pdt_14.f03 -O2 execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -O2 execution test WARNING: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test WARNING: gfortran.dg/pdt_14.f03 -O3 -g execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -O3 -g execution test WARNING: gfortran.dg/pdt_14.f03 -Os execution test program timed out. FAIL: gfortran.dg/pdt_14.f03 -Os execution test FAIL: gfortran.dg/sizeof_4.f90 -O0 execution test FAIL: gfortran.dg/sizeof_4.f90 -O1 execution test FAIL: gfortran.dg/sizeof_4.f90 -O2 execution test FAIL: gfortran.dg/sizeof_4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/sizeof_4.f90 -O3 -g execution test FAIL: gfortran.dg/sizeof_4.f90 -Os execution test With asumed_type_9.f90 we get: __final_test_T4 (struct array15_t4 & restrict array, integer(kind=8) byte_stride, logical(kind=1) fini_coarray) called as: struct array00_t1 decl_uidesc.19 __final_test_T4 (, 24, 0); and we optimize out initializer of desc.19 since it is TBAA incompatible (so same problem as with assumed type but this time the consumer descriptor is not universal; just different). With finalize_25 I see: __final_gn_Sl (struct array15_sl & restrict array, integer(kind=8) byte_stride, logical(kind=1) fini_coarray) called as: struct array00_sl desc.20 ... __final_gn_Sl (, 64, 0); With pdf14_f03 I get disambiguation ipa-modref: in main/8, call to push_8/6 does not clobber __vtab_link_module_Pdtlink_8._deallocate 14->13 so this seems different and I am not quite sure what is wrong here. FAIL: gfortran.dg/sizeof_4.f90 -O1 execution test actually goes away with reverting your patch. Honza > > Tobias > > gcc/fortran/ChangeLog: > > * trans-array.c (gfc_conv_expr_descriptor): > (gfc_conv_array_parameter): > * trans-array.h (gfc_conv_expr_descriptor): > > gcc/fortran/trans-array.c | 15 +-- > gcc/fortran/trans-array.h | 3 ++- > 2 files changed, 11 insertions(+), 7 deletions(-) > > diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c > index 6566c47d4ae..a5d1b477a0a 100644 > --- a/gcc/fortran/trans-array.c > +++ b/gcc/fortran/trans-array.c > @@ -7216,7 +7216,7 @@ walk_coarray (gfc_expr *e) > function call. */ > > void > -gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr) > +gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr, bool want_assumed_type) > { >gfc_ss *ss; >gfc_ss_type ss_type; > @@ -7611,7 +7611,9 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr) >else > { > /* Otherwise make a new one. */ > - if (expr->ts.type == BT_CHARACTER && expr->ts.deferred) > + if (want_assumed_type) > + parmtype = ptr_type_node; > + else if (expr->ts.type == BT_CHARACTER && expr->ts.deferred) > parmtype = gfc_typenode_for_spec (>ts); > else > parmtype = gfc_get_element_type (TREE_TYPE (desc)); > @@ -7950,7 +7952,8 @@ gfc_conv_array_parameter (gfc_se * se,
Re: Issue with ggc_delete and finalizers (was Re: New modref/ipa_modref optimization passes)
On Thu, 2020-09-24 at 08:30 +0200, Jan Hubicka wrote: > Hi, > This patch makes ggc_delete to be paired with ggc_alloc_no_dtor. > I copy same scheme as used by Martin in ipa-fnsummary, that is > creating a > static member function create_ggc hidding the ugly bits and using it > in > ipa-modref.c. > > I also noticed that modref-tree leaks memory on destruction/collapse > method and > fixed that. > > Bootstrapped/regtested x86_64-linux. It looks like you committed this as c9da53d6987af5f8ff68b58dd76a9fbc900a6a21. This appears to fix the issues seen with the GC with jit (PR jit/97169). With the previous commit, jit.sum had: # of expected passes5751 # of unexpected failures64 # of unresolved testcases 1 with a number of SIGSEGV showing up in the FAIL reports, whereas with c9da53d6987af5f8ff68b58dd76a9fbc900a6a21, jit.sum is restored to: # of expected passes10854 Thanks! Dave
Re: Issue with ggc_delete and finalizers (was Re: New modref/ipa_modref optimization passes)
Hi, This patch makes ggc_delete to be paired with ggc_alloc_no_dtor. I copy same scheme as used by Martin in ipa-fnsummary, that is creating a static member function create_ggc hidding the ugly bits and using it in ipa-modref.c. I also noticed that modref-tree leaks memory on destruction/collapse method and fixed that. Bootstrapped/regtested x86_64-linux. gcc/ChangeLog: 2020-09-24 Jan Hubicka * ipa-modref-tree.h (modref_base::collapse): Release memory. (modref_tree::create_ggc): New member function. (modref_tree::colapse): Release memory. (modref_tree::~modref_tree): New destructor. * ipa-modref.c (modref_summaries::create_ggc): New function. (analyze_function): Use create_ggc. (modref_summaries::duplicate): Likewise. (read_modref_records): Likewise. (modref_read): Likewise. diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h index 3bdd3058aa1..82e959a7d46 100644 --- a/gcc/ipa-modref-tree.h +++ b/gcc/ipa-modref-tree.h @@ -95,7 +95,15 @@ struct GTY((user)) modref_base_node void collapse () { -vec_free (refs); +size_t i; +modref_ref_node *r; + +if (refs) + { + FOR_EACH_VEC_SAFE_ELT (refs, i, r) + ggc_free (r); + vec_free (refs); + } refs = NULL; every_ref = true; } @@ -214,12 +222,36 @@ struct GTY((user)) modref_tree return NULL; } + /* Return ggc allocated instance. We explicitly call destructors via + ggc_delete and do not want finalizers to be registered and + called at the garbage collection time. */ + static modref_tree *create_ggc (size_t max_bases, size_t max_refs) + { +return new (ggc_alloc_no_dtor> ()) +modref_tree (max_bases, max_refs); + } + void collapse () { -vec_free (bases); +size_t i; +modref_base_node *n; + +if (bases) + { + FOR_EACH_VEC_SAFE_ELT (bases, i, n) + { + n->collapse (); + ggc_free (n); + } + vec_free (bases); + } bases = NULL; every_base = true; } + ~modref_tree () + { +collapse (); + } }; void modref_c_tests (); diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index 9cc90565891..43545c1fb09 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -84,6 +84,11 @@ public: ipa-modref pass execution needs to be analyzed in IPA mode while all other insertions leads to normal analysis. */ bool ipa; + static modref_summaries *create_ggc (symbol_table *symtab) + { +return new (ggc_alloc_no_dtor ()) +modref_summaries (symtab); + } }; /* Global variable holding all modref summaries. */ @@ -608,8 +613,7 @@ analyze_function (function *f, bool ipa) /* Initialize the summary. */ if (!summaries) -summaries = new (ggc_alloc ()) -modref_summaries (symtab); +summaries = modref_summaries::create_ggc (symtab); else /* Remove existing summary if we are re-running the pass. */ summaries->remove (cgraph_node::get (f->decl)); @@ -633,28 +637,22 @@ analyze_function (function *f, bool ipa) if (nolto) { gcc_assert (!summary->loads); - summary->loads -= new (ggc_alloc > ()) - modref_records (param_modref_max_bases, - param_modref_max_refs); + summary->loads = modref_records::create_ggc (param_modref_max_bases, + param_modref_max_refs); gcc_assert (!summary->stores); - summary->stores -= new (ggc_alloc > ()) - modref_records (param_modref_max_bases, - param_modref_max_refs); + summary->stores = modref_records::create_ggc (param_modref_max_bases, + param_modref_max_refs); } if (lto) { gcc_assert (!summary->loads_lto); - summary->loads_lto -= new (ggc_alloc > ()) - modref_records_lto (param_modref_max_bases, - param_modref_max_refs); + summary->loads_lto = modref_records_lto::create_ggc +(param_modref_max_bases, + param_modref_max_refs); gcc_assert (!summary->stores_lto); - summary->stores_lto -= new (ggc_alloc > ()) - modref_records_lto (param_modref_max_bases, - param_modref_max_refs); + summary->stores_lto = modref_records_lto::create_ggc +(param_modref_max_bases, + param_modref_max_refs); } summary->finished = false; int ecf_flags = flags_from_decl_or_type (current_function_decl); @@ -730,34 +728,30 @@ modref_summaries::duplicate (cgraph_node *, cgraph_node *, dst_data->finished = src_data->finished; if (src_data->stores) { - dst_data->stores = new (ggc_alloc > ()) -
Re: Issue with ggc_delete and finalizers (was Re: New modref/ipa_modref optimization passes)
> > Summarizing what's going on: > > We have a use-after-ggc_delete happening with the finalizers code. > > analyze_function has: > > summaries = new (ggc_alloc ()) >modref_summaries (symtab); > > ggc_alloc (as opposed to ggc_alloc_no_dtor) uses need_finalization_p > and "sees" that the class has a nontrivial dtor, and hence it passes > finalize to ggc_internal_alloc as the "f" callback. > > Within ggc_internal_alloc we have: > > if (f) > add_finalizer (result, f, s, n); > > and so that callback is registered within G.finalizers - but there's > nothing stored in the object itself to track that finalizer. > > Later, in ipa_modref_c_finalize, gcc_delete is called on the > mod_summaries object, but the finalizer is still registered in > G.finalizers with its address. > > Later, a GC happens, and if the bit for "marked" on that old > modref_summaries object happens to be cleared (with whatever that > memory is now being used for, if anything), the finalizer callback is > called, and ~modref_summaries is called with its "this" pointing at > random bits, and we have a segfault. > > This seems like a big "gotcha" in ggc_delete: it doesn't remove any > finalizer for the object, instead leaving it as a timebomb to happen in > some future collection. > > Should ggc_delete remove the finalizer? It would have to scan the > G.finalizer vecs to find them - O(N). Alternatively, perhaps we could > stash some kind of reference to the finalizer in memory near the object > (perhaps allocating a header). I think the difference between ggc_free and ggc_delete should be just like the difference between free and delete, that is, ggc_delete should call finalizer. I think the mistake is that ggc_delete would work well with ggc_alloc_no_dtor, but the patch uses ggc_alloc. I guess we do not see the problem in normal compilation since ggc_delete is used in the patch 3 times for objects with no destructor and once for object with a destructor but only at the end of compilation when ggc is not run w/o libjit. > > Or we could put a big warning on ggc_delete saying not to use it on > ggc_alloc-ed objects with dtors. I do not think it is reasonable for ggc_delete to walk all finalizers, so perhaps we just want a sanity check with --enable-checking that things are not mixed up? That is, maintain on-side hash of finalizers introduced by ggc_alloc and look it up in ggc_delete, assert on a spot with a comment saying how things are intended to work? > > I'm not sure what the best approach here is. > > There were only 4 places in the source tree where ggc_delete were used > before the patch, which added 4 places; maybe we should just remove > those new ggc_deletes and set the pointers to NULL and let them be > cleared as regular garbage? For modref we really need to release things explicitly since it runs at WPA time and I am trying to maintain the fact that WPA do not need ggc_collect runs: they have large memory footprint and it is easy to explicitly manage all memory used by symtab/optimization summaries. Of course I can reorganize the code to not have a destructor as well (which is not very ++-sy). In fact since the templates have hand written markers anyway, I was htinking of moving them off ggc memory and just walk to discover the tree pointers in them. Honza > > Dave >
Re: New modref/ipa_modref optimization passes
On Tue, 2020-09-22 at 22:23 +0200, Jan Hubicka wrote: > > On Tue, 2020-09-22 at 20:39 +0200, Jan Hubicka wrote: > > > David, > > > with jit I get the following: > > > /usr/local/x86_64-pc-linux-gnu/bin/ld: final link failed: > > > nonrepresentable section on output > > > collect2: error: ld returned 1 exit status > > > make[3]: *** [../../gcc/jit/Make-lang.in:121: libgccjit.so.0.0.1] > > > Error > > > > > > Is there a fix/workaround? > > > > I don't recognize that specific error, but googling suggests it may > > relate to position-independent code. > > > > Are you configuring with --enable-host-shared ? This is needed > > when > > enabling "jit" in --enable-languages (but slows down the compiler > > by a > > few percent, which is why "jit" isn't in "all"). > > Yes --enable-languages=all,jit --enable-host-shared. > I suppose my binutils may show the age, I will check that tomorrow. > It > looks like weird error. FWIW if you do get it to build, you can reproduce the crash via running this in builddir/gcc: [gcc] $ PRESERVE_EXECUTABLES= \ make check-jit \ RUNTESTFLAGS="-v -v -v jit.exp=test-factorial.c" [gcc] $ PATH=.:$PATH \ LD_LIBRARY_PATH=. \ LIBRARY_PATH=. \ gdb --args \ testsuite/jit/test-factorial.c.exe (taken from https://gcc.gnu.org/onlinedocs/jit/internals/index.html#running-the-test-suite )
Issue with ggc_delete and finalizers (was Re: New modref/ipa_modref optimization passes)
On Tue, 2020-09-22 at 22:24 +0200, Jan Hubicka wrote: > > On Tue, 2020-09-22 at 16:13 -0400, David Malcolm wrote: > > > On Tue, 2020-09-22 at 20:39 +0200, Jan Hubicka wrote: > > > > David, > > > > with jit I get the following: > > > > /usr/local/x86_64-pc-linux-gnu/bin/ld: final link failed: > > > > nonrepresentable section on output > > > > collect2: error: ld returned 1 exit status > > > > make[3]: *** [../../gcc/jit/Make-lang.in:121: > > > > libgccjit.so.0.0.1] > > > > Error > > > > > > > > Is there a fix/workaround? > > > > > > I don't recognize that specific error, but googling suggests it > > > may > > > relate to position-independent code. > > > > > > Are you configuring with --enable-host-shared ? This is needed > > > when > > > enabling "jit" in --enable-languages (but slows down the compiler > > > by > > > a > > > few percent, which is why "jit" isn't in "all"). > > > > > > > > > > Patch I am trying to test/debug is attached, it fixes the > > > > selftest > > > > issue > > > > and the destructor. > > > > > > Thanks. > > > > > > Sadly it doesn't fix the jit crashes, which are now in bugzilla > > > (as > > > PR > > > jit/97169). > > > > > > Without the patch, the jit testcases crash at the end of the 1st > > > in- > > > process iteration, in the dtor for the the new pass. > > > > > > With the patch the jit testcases crash inside the 3rd in-process > > > iteration, invoking a modref_summaries finalizer at whichever GC- > > > collection point happens first, I think, where the > > > modref_summaries * > > > seems to be pointing at corrupt data: > > > > > > (gdb) p *(modref_summaries *)p > > > $2 = {> = > > > {> = { > > > _vptr.function_summary_base = 0x20001, > > > m_symtab_insertion_hook = 0x1, m_symtab_removal_hook = > > > 0x10004, > > > m_symtab_duplication_hook = 0x0, m_symtab = 0x644210, > > > m_insertion_enabled = 112, m_allocator = {m_allocator = { > > > m_name = 0x0, m_id = 0, m_elts_per_block = 1, > > > m_returned_free_list = 0x7afafaf01, > > > m_virgin_free_list = 0xafafafafafaf0001 > > access > > > memory at address 0xafafafafafaf0001>, > > > m_virgin_elts_remaining = 0, m_elts_allocated = > > > 140737080343888, m_elts_free = 0, m_blocks_allocated = 0, > > > m_block_list = 0x0, m_elt_size = 6517120, m_size = 13, > > > m_initialized = false, m_location = { > > > m_filename = 0x0, m_function = 0x0, m_line = 1, > > > m_origin > > > = > > > 2947481856, m_ggc = false, > > > m_vector = 0x0}, ipa = false} > > > > > > I think this latter crash may be a pre-existing bug in how the > > > jit > > > interacts with gc finalizers. I think the finalizers are > > > accumulating > > > from in-process run to run, leading to chaos, but I need to debug > > > it > > > some more to be sure. Alternatively, is there a way that a > > > finalizer > > > is being registered, and then the object is somehow clobbered > > > without > > > the finalizer being unregistered from the vec of finalizers? > > > > Aha: this patch on top of yours seems to fix it, at least for the > > test > > I've been debugging. > > > > Calling gcc_delete on something seems to delete it without removing > > the > > finalizer, leaving the finalizer around to run on whatever the > > memory > > eventually gets reused for, leading to segfaults: > > > > diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > > index 4b9c4db4ee9..64d314321cb 100644 > > --- a/gcc/ipa-modref.c > > +++ b/gcc/ipa-modref.c > > @@ -1372,8 +1372,6 @@ unsigned int pass_ipa_modref::execute > > (function *) > > void > > ipa_modref_c_finalize () > > { > > - if (summaries) > > -ggc_delete (summaries); > >summaries = NULL; > > } > > Ah, thanks. That is very odd behaviour of delete indeed. Summarizing what's going on: We have a use-after-ggc_delete happening with the finalizers code. analyze_function has: summaries = new (ggc_alloc ()) modref_summaries (symtab); ggc_alloc (as opposed to ggc_alloc_no_dtor) uses need_finalization_p and "sees" that the class has a nontrivial dtor, and hence it passes finalize to ggc_internal_alloc as the "f" callback. Within ggc_internal_alloc we have: if (f) add_finalizer (result, f, s, n); and so that callback is registered within G.finalizers - but there's nothing stored in the object itself to track that finalizer. Later, in ipa_modref_c_finalize, gcc_delete is called on the mod_summaries object, but the finalizer is still registered in G.finalizers with its address. Later, a GC happens, and if the bit for "marked" on that old modref_summaries object happens to be cleared (with whatever that memory is now being used for, if anything), the finalizer callback is called, and ~modref_summaries is called with its "this" pointing at random bits, and we have a segfault. This seems like a big "gotcha" in ggc_delete: it doesn't remove any finalizer for the object, instead leaving it as a timebomb to
Re: New modref/ipa_modref optimization passes
> On Tue, 2020-09-22 at 16:13 -0400, David Malcolm wrote: > > On Tue, 2020-09-22 at 20:39 +0200, Jan Hubicka wrote: > > > David, > > > with jit I get the following: > > > /usr/local/x86_64-pc-linux-gnu/bin/ld: final link failed: > > > nonrepresentable section on output > > > collect2: error: ld returned 1 exit status > > > make[3]: *** [../../gcc/jit/Make-lang.in:121: libgccjit.so.0.0.1] > > > Error > > > > > > Is there a fix/workaround? > > > > I don't recognize that specific error, but googling suggests it may > > relate to position-independent code. > > > > Are you configuring with --enable-host-shared ? This is needed when > > enabling "jit" in --enable-languages (but slows down the compiler by > > a > > few percent, which is why "jit" isn't in "all"). > > > > > > > Patch I am trying to test/debug is attached, it fixes the selftest > > > issue > > > and the destructor. > > > > Thanks. > > > > Sadly it doesn't fix the jit crashes, which are now in bugzilla (as > > PR > > jit/97169). > > > > Without the patch, the jit testcases crash at the end of the 1st in- > > process iteration, in the dtor for the the new pass. > > > > With the patch the jit testcases crash inside the 3rd in-process > > iteration, invoking a modref_summaries finalizer at whichever GC- > > collection point happens first, I think, where the modref_summaries * > > seems to be pointing at corrupt data: > > > > (gdb) p *(modref_summaries *)p > > $2 = {> = > > {> = { > > _vptr.function_summary_base = 0x20001, > > m_symtab_insertion_hook = 0x1, m_symtab_removal_hook = 0x10004, > > m_symtab_duplication_hook = 0x0, m_symtab = 0x644210, > > m_insertion_enabled = 112, m_allocator = {m_allocator = { > > m_name = 0x0, m_id = 0, m_elts_per_block = 1, > > m_returned_free_list = 0x7afafaf01, > > m_virgin_free_list = 0xafafafafafaf0001 > access > > memory at address 0xafafafafafaf0001>, > > m_virgin_elts_remaining = 0, m_elts_allocated = > > 140737080343888, m_elts_free = 0, m_blocks_allocated = 0, > > m_block_list = 0x0, m_elt_size = 6517120, m_size = 13, > > m_initialized = false, m_location = { > > m_filename = 0x0, m_function = 0x0, m_line = 1, m_origin > > = > > 2947481856, m_ggc = false, > > m_vector = 0x0}, ipa = false} > > > > I think this latter crash may be a pre-existing bug in how the jit > > interacts with gc finalizers. I think the finalizers are > > accumulating > > from in-process run to run, leading to chaos, but I need to debug it > > some more to be sure. Alternatively, is there a way that a finalizer > > is being registered, and then the object is somehow clobbered without > > the finalizer being unregistered from the vec of finalizers? > > Aha: this patch on top of yours seems to fix it, at least for the test > I've been debugging. > > Calling gcc_delete on something seems to delete it without removing the > finalizer, leaving the finalizer around to run on whatever the memory > eventually gets reused for, leading to segfaults: > > diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > index 4b9c4db4ee9..64d314321cb 100644 > --- a/gcc/ipa-modref.c > +++ b/gcc/ipa-modref.c > @@ -1372,8 +1372,6 @@ unsigned int pass_ipa_modref::execute (function *) > void > ipa_modref_c_finalize () > { > - if (summaries) > -ggc_delete (summaries); >summaries = NULL; > } Ah, thanks. That is very odd behaviour of delete indeed. Honza > > > >
Re: New modref/ipa_modref optimization passes
> On Tue, 2020-09-22 at 20:39 +0200, Jan Hubicka wrote: > > David, > > with jit I get the following: > > /usr/local/x86_64-pc-linux-gnu/bin/ld: final link failed: > > nonrepresentable section on output > > collect2: error: ld returned 1 exit status > > make[3]: *** [../../gcc/jit/Make-lang.in:121: libgccjit.so.0.0.1] > > Error > > > > Is there a fix/workaround? > > I don't recognize that specific error, but googling suggests it may > relate to position-independent code. > > Are you configuring with --enable-host-shared ? This is needed when > enabling "jit" in --enable-languages (but slows down the compiler by a > few percent, which is why "jit" isn't in "all"). Yes --enable-languages=all,jit --enable-host-shared. I suppose my binutils may show the age, I will check that tomorrow. It looks like weird error. > > > > Patch I am trying to test/debug is attached, it fixes the selftest > > issue > > and the destructor. > > Thanks. > > Sadly it doesn't fix the jit crashes, which are now in bugzilla (as PR > jit/97169). > > Without the patch, the jit testcases crash at the end of the 1st in- > process iteration, in the dtor for the the new pass. > > With the patch the jit testcases crash inside the 3rd in-process > iteration, invoking a modref_summaries finalizer at whichever GC- > collection point happens first, I think, where the modref_summaries * > seems to be pointing at corrupt data: > > (gdb) p *(modref_summaries *)p > $2 = {> = > {> = { > _vptr.function_summary_base = 0x20001, > m_symtab_insertion_hook = 0x1, m_symtab_removal_hook = 0x10004, > m_symtab_duplication_hook = 0x0, m_symtab = 0x644210, > m_insertion_enabled = 112, m_allocator = {m_allocator = { > m_name = 0x0, m_id = 0, m_elts_per_block = 1, > m_returned_free_list = 0x7afafaf01, > m_virgin_free_list = 0xafafafafafaf0001 memory at address 0xafafafafafaf0001>, > m_virgin_elts_remaining = 0, m_elts_allocated = > 140737080343888, m_elts_free = 0, m_blocks_allocated = 0, > m_block_list = 0x0, m_elt_size = 6517120, m_size = 13, > m_initialized = false, m_location = { > m_filename = 0x0, m_function = 0x0, m_line = 1, m_origin = > 2947481856, m_ggc = false, > m_vector = 0x0}, ipa = false} > > I think this latter crash may be a pre-existing bug in how the jit > interacts with gc finalizers. I think the finalizers are accumulating > from in-process run to run, leading to chaos, but I need to debug it > some more to be sure. Alternatively, is there a way that a finalizer > is being registered, and then the object is somehow clobbered without > the finalizer being unregistered from the vec of finalizers? It looks like released memory. I saw similar problem with ggc calling finalizer in "wrong" order. David's modref tree has two layers and destructor of one was freeing the anohter that is good if you destroy first the outer type, but not good if you do it in wrong order. I will try to reproduce it. I also plan to turn the classes to pods and put them directly into the vectors. I should not have allowed David to make a template at first place :) Honza > > Dave >
Re: New modref/ipa_modref optimization passes
On Tue, 2020-09-22 at 16:13 -0400, David Malcolm wrote: > On Tue, 2020-09-22 at 20:39 +0200, Jan Hubicka wrote: > > David, > > with jit I get the following: > > /usr/local/x86_64-pc-linux-gnu/bin/ld: final link failed: > > nonrepresentable section on output > > collect2: error: ld returned 1 exit status > > make[3]: *** [../../gcc/jit/Make-lang.in:121: libgccjit.so.0.0.1] > > Error > > > > Is there a fix/workaround? > > I don't recognize that specific error, but googling suggests it may > relate to position-independent code. > > Are you configuring with --enable-host-shared ? This is needed when > enabling "jit" in --enable-languages (but slows down the compiler by > a > few percent, which is why "jit" isn't in "all"). > > > > Patch I am trying to test/debug is attached, it fixes the selftest > > issue > > and the destructor. > > Thanks. > > Sadly it doesn't fix the jit crashes, which are now in bugzilla (as > PR > jit/97169). > > Without the patch, the jit testcases crash at the end of the 1st in- > process iteration, in the dtor for the the new pass. > > With the patch the jit testcases crash inside the 3rd in-process > iteration, invoking a modref_summaries finalizer at whichever GC- > collection point happens first, I think, where the modref_summaries * > seems to be pointing at corrupt data: > > (gdb) p *(modref_summaries *)p > $2 = {> = > {> = { > _vptr.function_summary_base = 0x20001, > m_symtab_insertion_hook = 0x1, m_symtab_removal_hook = 0x10004, > m_symtab_duplication_hook = 0x0, m_symtab = 0x644210, > m_insertion_enabled = 112, m_allocator = {m_allocator = { > m_name = 0x0, m_id = 0, m_elts_per_block = 1, > m_returned_free_list = 0x7afafaf01, > m_virgin_free_list = 0xafafafafafaf0001 access > memory at address 0xafafafafafaf0001>, > m_virgin_elts_remaining = 0, m_elts_allocated = > 140737080343888, m_elts_free = 0, m_blocks_allocated = 0, > m_block_list = 0x0, m_elt_size = 6517120, m_size = 13, > m_initialized = false, m_location = { > m_filename = 0x0, m_function = 0x0, m_line = 1, m_origin > = > 2947481856, m_ggc = false, > m_vector = 0x0}, ipa = false} > > I think this latter crash may be a pre-existing bug in how the jit > interacts with gc finalizers. I think the finalizers are > accumulating > from in-process run to run, leading to chaos, but I need to debug it > some more to be sure. Alternatively, is there a way that a finalizer > is being registered, and then the object is somehow clobbered without > the finalizer being unregistered from the vec of finalizers? Aha: this patch on top of yours seems to fix it, at least for the test I've been debugging. Calling gcc_delete on something seems to delete it without removing the finalizer, leaving the finalizer around to run on whatever the memory eventually gets reused for, leading to segfaults: diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index 4b9c4db4ee9..64d314321cb 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -1372,8 +1372,6 @@ unsigned int pass_ipa_modref::execute (function *) void ipa_modref_c_finalize () { - if (summaries) -ggc_delete (summaries); summaries = NULL; }
Re: New modref/ipa_modref optimization passes
Hello, this patch fixes the selftests and destructor issue noticed by David (Malcolm). According to David jit still crashes at different but I am hitting different build failure in libjit that I will need to analyze tomorrow. Bootstrapped/regtested x86_64-linux, comitted. * ipa-modref-tree.c: Add namespace selftest. (modref_tree_c_tests): Rename to ... (ipa_modref_tree_c_tests): ... this. * ipa-modref.c (pass_modref): Remove destructor. (ipa_modref_c_finalize): New function. * ipa-modref.h (ipa_modref_c_finalize): Declare. * selftest-run-tests.c (selftest::run_tests): Call ipa_modref_c_finalize. * selftest.h (ipa_modref_tree_c_tests): Declare. * toplev.c: Include ipa-modref-tree.h and ipa-modref.h (toplev::finalize): Call ipa_modref_c_finalize. diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c index e37dee67fa3..a84508a2268 100644 --- a/gcc/ipa-modref-tree.c +++ b/gcc/ipa-modref-tree.c @@ -28,6 +28,7 @@ along with GCC; see the file COPYING3. If not see #if CHECKING_P +namespace selftest { static void test_insert_search_collapse () @@ -156,12 +157,14 @@ test_merge () void -modref_tree_c_tests () +ipa_modref_tree_c_tests () { test_insert_search_collapse (); test_merge (); } +} // namespace selftest + #endif void diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index af0b710333e..ac7579a9e75 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -767,12 +770,6 @@ class pass_modref : public gimple_opt_pass pass_modref (gcc::context *ctxt) : gimple_opt_pass (pass_data_modref, ctxt) {} -~pass_modref () - { - ggc_delete (summaries); - summaries = NULL; - } - /* opt_pass methods: */ opt_pass *clone () { @@ -1373,4 +1370,14 @@ unsigned int pass_ipa_modref::execute (function *) return 0; } +/* Summaries must stay alive until end of compilation. */ + +void +ipa_modref_c_finalize () +{ + if (summaries) +ggc_delete (summaries); + summaries = NULL; +} + #include "gt-ipa-modref.h" diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h index 6f979200cc2..6cccdfe7af3 100644 --- a/gcc/ipa-modref.h +++ b/gcc/ipa-modref.h @@ -44,5 +44,6 @@ struct GTY(()) modref_summary }; modref_summary *get_modref_function_summary (cgraph_node *func); +void ipa_modref_c_finalize (); #endif diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c index f0a81d43fd6..7a89b2df5bd 100644 --- a/gcc/selftest-run-tests.c +++ b/gcc/selftest-run-tests.c @@ -90,6 +90,7 @@ selftest::run_tests () read_rtl_function_c_tests (); digraph_cc_tests (); tristate_cc_tests (); + ipa_modref_tree_c_tests (); /* Higher-level tests, or for components that other selftests don't rely on. */ diff --git a/gcc/selftest.h b/gcc/selftest.h index 5cffa13aedd..6c6c7f28675 100644 --- a/gcc/selftest.h +++ b/gcc/selftest.h @@ -268,6 +268,7 @@ extern void vec_perm_indices_c_tests (); extern void wide_int_cc_tests (); extern void opt_proposer_c_tests (); extern void dbgcnt_c_tests (); +extern void ipa_modref_tree_c_tests (); extern int num_passes; diff --git a/gcc/toplev.c b/gcc/toplev.c index cdd4b5b4f92..a4cb8bb262e 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -84,6 +84,8 @@ along with GCC; see the file COPYING3. If not see #include "dump-context.h" #include "print-tree.h" #include "optinfo-emit-json.h" +#include "ipa-modref-tree.h" +#include "ipa-modref.h" #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO) #include "dbxout.h" @@ -2497,6 +2499,7 @@ toplev::finalize (void) /* Needs to be called before cgraph_c_finalize since it uses symtab. */ ipa_reference_c_finalize (); ipa_fnsummary_c_finalize (); + ipa_modref_c_finalize (); cgraph_c_finalize (); cgraphunit_c_finalize ();
Re: New modref/ipa_modref optimization passes
On Tue, 2020-09-22 at 20:39 +0200, Jan Hubicka wrote: > David, > with jit I get the following: > /usr/local/x86_64-pc-linux-gnu/bin/ld: final link failed: > nonrepresentable section on output > collect2: error: ld returned 1 exit status > make[3]: *** [../../gcc/jit/Make-lang.in:121: libgccjit.so.0.0.1] > Error > > Is there a fix/workaround? I don't recognize that specific error, but googling suggests it may relate to position-independent code. Are you configuring with --enable-host-shared ? This is needed when enabling "jit" in --enable-languages (but slows down the compiler by a few percent, which is why "jit" isn't in "all"). > Patch I am trying to test/debug is attached, it fixes the selftest > issue > and the destructor. Thanks. Sadly it doesn't fix the jit crashes, which are now in bugzilla (as PR jit/97169). Without the patch, the jit testcases crash at the end of the 1st in- process iteration, in the dtor for the the new pass. With the patch the jit testcases crash inside the 3rd in-process iteration, invoking a modref_summaries finalizer at whichever GC- collection point happens first, I think, where the modref_summaries * seems to be pointing at corrupt data: (gdb) p *(modref_summaries *)p $2 = {> = {> = { _vptr.function_summary_base = 0x20001, m_symtab_insertion_hook = 0x1, m_symtab_removal_hook = 0x10004, m_symtab_duplication_hook = 0x0, m_symtab = 0x644210, m_insertion_enabled = 112, m_allocator = {m_allocator = { m_name = 0x0, m_id = 0, m_elts_per_block = 1, m_returned_free_list = 0x7afafaf01, m_virgin_free_list = 0xafafafafafaf0001 , m_virgin_elts_remaining = 0, m_elts_allocated = 140737080343888, m_elts_free = 0, m_blocks_allocated = 0, m_block_list = 0x0, m_elt_size = 6517120, m_size = 13, m_initialized = false, m_location = { m_filename = 0x0, m_function = 0x0, m_line = 1, m_origin = 2947481856, m_ggc = false, m_vector = 0x0}, ipa = false} I think this latter crash may be a pre-existing bug in how the jit interacts with gc finalizers. I think the finalizers are accumulating from in-process run to run, leading to chaos, but I need to debug it some more to be sure. Alternatively, is there a way that a finalizer is being registered, and then the object is somehow clobbered without the finalizer being unregistered from the vec of finalizers? Dave
Re: New modref/ipa_modref optimization passes
> On Sun, 2020-09-20 at 19:30 +0200, Jan Hubicka wrote: > > > > > [...] > > > > Should new C++ source files have a .cc suffix, rather than .c? > > > > > > [...] > > > > > > > + $(srcdir)/ipa-modref.h $(srcdir)/ipa-modref.c \ > > > > > > ...which would affect this^ > > > > I was wondering about that and decided to stay with .c since it is > > what > > other ipa passes use. I can rename the files. > > Given that they're in the source tree now, maybe better to wait until > some mass renaming in the future? At the same time, I am only having patches against it, and I have no problem update the name. > > > We have some sources with > > .c extension and others with .cc while they are now all C++. Is there > > some plan to clean it up? > > I think we've been avoiding it, partly out of a concern of making > backports harder, and also because someone has to do the work. > > That said, it's yet another unfinished transition, and is technical > debt for the project. It's confusing to newcomers. > > It's been bugging me for a while, so I might take a look at doing it in > this cycle. Agreed. It would be nice to do the mass rename and at the same time make a sane directory structure so newcomers can locate optimization passes and other components. David Cepelik was my student so he may have some feedback about what he found hard. I think main stoppers was the garbage collector (since he decided to do the template for modref tree) and the flat includes that breaks if you do not do them in right order and resolve all dependencies. Honza > > Dave >
Re: New modref/ipa_modref optimization passes
David, with jit I get the following: /usr/local/x86_64-pc-linux-gnu/bin/ld: final link failed: nonrepresentable section on output collect2: error: ld returned 1 exit status make[3]: *** [../../gcc/jit/Make-lang.in:121: libgccjit.so.0.0.1] Error Is there a fix/workaround? Patch I am trying to test/debug is attached, it fixes the selftest issue and the destructor. Honza diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c index e37dee67fa3..a84508a2268 100644 --- a/gcc/ipa-modref-tree.c +++ b/gcc/ipa-modref-tree.c @@ -28,6 +28,7 @@ along with GCC; see the file COPYING3. If not see #if CHECKING_P +namespace selftest { static void test_insert_search_collapse () @@ -156,12 +157,14 @@ test_merge () void -modref_tree_c_tests () +ipa_modref_tree_c_tests () { test_insert_search_collapse (); test_merge (); } +} // namespace selftest + #endif void diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index af0b710333e..ac7579a9e75 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -767,12 +770,6 @@ class pass_modref : public gimple_opt_pass pass_modref (gcc::context *ctxt) : gimple_opt_pass (pass_data_modref, ctxt) {} -~pass_modref () - { - ggc_delete (summaries); - summaries = NULL; - } - /* opt_pass methods: */ opt_pass *clone () { @@ -1373,4 +1370,14 @@ unsigned int pass_ipa_modref::execute (function *) return 0; } +/* Summaries must stay alive until end of compilation. */ + +void +ipa_modref_c_finalize () +{ + if (summaries) +ggc_delete (summaries); + summaries = NULL; +} + #include "gt-ipa-modref.h" diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h index 6f979200cc2..6cccdfe7af3 100644 --- a/gcc/ipa-modref.h +++ b/gcc/ipa-modref.h @@ -44,5 +44,6 @@ struct GTY(()) modref_summary }; modref_summary *get_modref_function_summary (cgraph_node *func); +void ipa_modref_c_finalize (); #endif diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c index f0a81d43fd6..7a89b2df5bd 100644 --- a/gcc/selftest-run-tests.c +++ b/gcc/selftest-run-tests.c @@ -90,6 +90,7 @@ selftest::run_tests () read_rtl_function_c_tests (); digraph_cc_tests (); tristate_cc_tests (); + ipa_modref_tree_c_tests (); /* Higher-level tests, or for components that other selftests don't rely on. */ diff --git a/gcc/selftest.h b/gcc/selftest.h index 5cffa13aedd..6c6c7f28675 100644 --- a/gcc/selftest.h +++ b/gcc/selftest.h @@ -268,6 +268,7 @@ extern void vec_perm_indices_c_tests (); extern void wide_int_cc_tests (); extern void opt_proposer_c_tests (); extern void dbgcnt_c_tests (); +extern void ipa_modref_tree_c_tests (); extern int num_passes; diff --git a/gcc/toplev.c b/gcc/toplev.c index cdd4b5b4f92..a4cb8bb262e 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -84,6 +84,8 @@ along with GCC; see the file COPYING3. If not see #include "dump-context.h" #include "print-tree.h" #include "optinfo-emit-json.h" +#include "ipa-modref-tree.h" +#include "ipa-modref.h" #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO) #include "dbxout.h" @@ -2497,6 +2499,7 @@ toplev::finalize (void) /* Needs to be called before cgraph_c_finalize since it uses symtab. */ ipa_reference_c_finalize (); ipa_fnsummary_c_finalize (); + ipa_modref_c_finalize (); cgraph_c_finalize (); cgraphunit_c_finalize ();
Re: New modref/ipa_modref optimization passes
> > Thanks; with that it survives the first in-process iteration, but then > dies inside the 3rd in-process iteration, on a different finalizer. > I'm beginning to suspect a pre-existing bad interaction between > finalizers and jit which perhaps this patch has exposed. > > I'll continue to investigate it. I will look into it tonight (I have meeting now about remote teaching), I think the pass should not have destructor but there should be fini function as other passes do, so I will clean this up. Those dtors make no sense and there also seems to be some memory leaks. Honza > > Dave >
Re: New modref/ipa_modref optimization passes
On Tue, 2020-09-22 at 09:07 +0200, Jan Hubicka wrote: > > > (gdb) p summaries > > > $3 = (fast_function_summary *) 0x0 > > > > > > I'm still investigating (but may have to call halt for the > > > night), but > > > this could be an underlying issue with the new passes; the jit > > > testsuite runs with the equivalent of: > > > > > > --param=ggc-min-expand=0 --param=ggc-min-heapsize=0 > > > > > > throughout to shake out GC issues (to do a full collection at > > > each GC > > > opportunity). > > > > > > Was this code tested with the jit? Do you see issues in cc1 if > > > you set > > > those params? Anyone else seeing "random" crashes? > > > > I suppose this happes when pass gets constructed but no summary is > > computed. Dos the NULL pointer guard here help? > > Hi, > I am currently in train and can not test the patch easilly, but this > should help. If you run the pass on empty input then the destruction > happens with NULL summaries pointer. > > My apologizes for that. > Honza > > diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > index af0b710333e..cd92b5a81d3 100644 > --- a/gcc/ipa-modref.c > +++ b/gcc/ipa-modref.c > @@ -769,7 +885,8 @@ class pass_modref : public gimple_opt_pass > > ~pass_modref () >{ > - ggc_delete (summaries); > + if (summaries) > + ggc_delete (summaries); > summaries = NULL; >} Thanks; with that it survives the first in-process iteration, but then dies inside the 3rd in-process iteration, on a different finalizer. I'm beginning to suspect a pre-existing bad interaction between finalizers and jit which perhaps this patch has exposed. I'll continue to investigate it. Dave
Re: New modref/ipa_modref optimization passes
On 9/21/20 10:10 AM, Richard Biener wrote: I see, so you would expect call to alsize to initialize things in array15_unkonwn type? That would work too. Yes, that's my expectation. But let's see what fortran folks say. RFC patch attached; I think the following should work, but I am not sure whether I missed something. I wonder what to do about '!GCC$ NO_ARG_CHECK :: x but that seems to work fine (creates void* type) and as it only permits assumed size or scalar variables, the descriptor issue does not occur. Thoughts? Tobias gcc/fortran/ChangeLog: * trans-array.c (gfc_conv_expr_descriptor): (gfc_conv_array_parameter): * trans-array.h (gfc_conv_expr_descriptor): gcc/fortran/trans-array.c | 15 +-- gcc/fortran/trans-array.h | 3 ++- 2 files changed, 11 insertions(+), 7 deletions(-) diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 6566c47d4ae..a5d1b477a0a 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -7216,7 +7216,7 @@ walk_coarray (gfc_expr *e) function call. */ void -gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr) +gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr, bool want_assumed_type) { gfc_ss *ss; gfc_ss_type ss_type; @@ -7611,7 +7611,9 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr) else { /* Otherwise make a new one. */ - if (expr->ts.type == BT_CHARACTER && expr->ts.deferred) + if (want_assumed_type) + parmtype = ptr_type_node; + else if (expr->ts.type == BT_CHARACTER && expr->ts.deferred) parmtype = gfc_typenode_for_spec (>ts); else parmtype = gfc_get_element_type (TREE_TYPE (desc)); @@ -7950,7 +7952,8 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77, { if (sym->attr.dummy || sym->attr.result) { - gfc_conv_expr_descriptor (se, expr); + gfc_conv_expr_descriptor (se, expr, + fsym && fsym->ts.type == BT_ASSUMED); tmp = se->expr; } if (size) @@ -8014,7 +8017,7 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77, if (no_pack || array_constructor || good_allocatable || ultimate_alloc_comp) { - gfc_conv_expr_descriptor (se, expr); + gfc_conv_expr_descriptor (se, expr, fsym && fsym->ts.type == BT_ASSUMED); /* Deallocate the allocatable components of structures that are not variable. */ if ((expr->ts.type == BT_DERIVED || expr->ts.type == BT_CLASS) @@ -8037,7 +8040,7 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77, if (this_array_result) { /* Result of the enclosing function. */ - gfc_conv_expr_descriptor (se, expr); + gfc_conv_expr_descriptor (se, expr, fsym && fsym->ts.type == BT_ASSUMED); if (size) array_parameter_size (se->expr, expr, size); se->expr = gfc_build_addr_expr (NULL_TREE, se->expr); @@ -8053,7 +8056,7 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77, { /* Every other type of array. */ se->want_pointer = 1; - gfc_conv_expr_descriptor (se, expr); + gfc_conv_expr_descriptor (se, expr, fsym && fsym->ts.type == BT_ASSUMED); if (size) array_parameter_size (build_fold_indirect_ref_loc (input_location, diff --git a/gcc/fortran/trans-array.h b/gcc/fortran/trans-array.h index e561605aaed..be3b1b79860 100644 --- a/gcc/fortran/trans-array.h +++ b/gcc/fortran/trans-array.h @@ -143,7 +143,8 @@ void gfc_get_dataptr_offset (stmtblock_t*, tree, tree, tree, bool, gfc_expr*); /* Obtain the span of an array. */ tree gfc_get_array_span (tree, gfc_expr *); /* Evaluate an array expression. */ -void gfc_conv_expr_descriptor (gfc_se *, gfc_expr *); +void gfc_conv_expr_descriptor (gfc_se *, gfc_expr *, + bool want_assumed_type = false); /* Convert an array for passing as an actual function parameter. */ void gfc_conv_array_parameter (gfc_se *, gfc_expr *, bool, const gfc_symbol *, const char *, tree *);
Re: New modref/ipa_modref optimization passes
> > (gdb) p summaries > > $3 = (fast_function_summary *) 0x0 > > > > I'm still investigating (but may have to call halt for the night), but > > this could be an underlying issue with the new passes; the jit > > testsuite runs with the equivalent of: > > > > --param=ggc-min-expand=0 --param=ggc-min-heapsize=0 > > > > throughout to shake out GC issues (to do a full collection at each GC > > opportunity). > > > > Was this code tested with the jit? Do you see issues in cc1 if you set > > those params? Anyone else seeing "random" crashes? > > I suppose this happes when pass gets constructed but no summary is > computed. Dos the NULL pointer guard here help? Hi, I am currently in train and can not test the patch easilly, but this should help. If you run the pass on empty input then the destruction happens with NULL summaries pointer. My apologizes for that. Honza diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index af0b710333e..cd92b5a81d3 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -769,7 +885,8 @@ class pass_modref : public gimple_opt_pass ~pass_modref () { - ggc_delete (summaries); + if (summaries) + ggc_delete (summaries); summaries = NULL; } > > Honza > > > > Thanks > > Dave > > > >
Re: New modref/ipa_modref optimization passes
> On Sun, 2020-09-20 at 19:30 +0200, Jan Hubicka wrote: > > > On Sun, 2020-09-20 at 00:32 +0200, Jan Hubicka wrote: > > > > Hi, > > > > this is cleaned up version of the patch. I removed unfinished > > > > bits, > > > > fixed > > > > propagation, cleaned it up and fixed fallout. > > > > > > [...] > > > > > > > While there are several areas for improvements but I think it is > > > > not > > > > in shape > > > > for mainline and rest can be dealt with incrementally. > > > > > > FWIW I think you typoed: > > > "not in shape for mainline" > > > when you meant: > > > "now in shape for mainline" > > > given... > > > > Yep, sorry for that :) > > I've started seeing crashes in the jit testsuite even with trivial > inputs, which are happening at pass_modref::~pass_modref at: > > 772 ggc_delete (summaries); > > on the first in-process iteration of the code, with: > > (gdb) p summaries > $3 = (fast_function_summary *) 0x0 > > I'm still investigating (but may have to call halt for the night), but > this could be an underlying issue with the new passes; the jit > testsuite runs with the equivalent of: > > --param=ggc-min-expand=0 --param=ggc-min-heapsize=0 > > throughout to shake out GC issues (to do a full collection at each GC > opportunity). > > Was this code tested with the jit? Do you see issues in cc1 if you set > those params? Anyone else seeing "random" crashes? I suppose this happes when pass gets constructed but no summary is computed. Dos the NULL pointer guard here help? Honza > > Thanks > Dave > >
Re: New modref/ipa_modref optimization passes
On Sun, 2020-09-20 at 19:30 +0200, Jan Hubicka wrote: > > On Sun, 2020-09-20 at 00:32 +0200, Jan Hubicka wrote: > > > Hi, > > > this is cleaned up version of the patch. I removed unfinished > > > bits, > > > fixed > > > propagation, cleaned it up and fixed fallout. > > > > [...] > > > > > While there are several areas for improvements but I think it is > > > not > > > in shape > > > for mainline and rest can be dealt with incrementally. > > > > FWIW I think you typoed: > > "not in shape for mainline" > > when you meant: > > "now in shape for mainline" > > given... > > Yep, sorry for that :) I've started seeing crashes in the jit testsuite even with trivial inputs, which are happening at pass_modref::~pass_modref at: 772 ggc_delete (summaries); on the first in-process iteration of the code, with: (gdb) p summaries $3 = (fast_function_summary *) 0x0 I'm still investigating (but may have to call halt for the night), but this could be an underlying issue with the new passes; the jit testsuite runs with the equivalent of: --param=ggc-min-expand=0 --param=ggc-min-heapsize=0 throughout to shake out GC issues (to do a full collection at each GC opportunity). Was this code tested with the jit? Do you see issues in cc1 if you set those params? Anyone else seeing "random" crashes? Thanks Dave
Re: New modref/ipa_modref optimization passes
On Mon, 21 Sep 2020, Jan Hubicka wrote: > > > > > > The problem is: > > > > > > alsize (struct array15_unknown & restrict a) > > > { > > > ... > > > _2 = *a_13(D).dtype.rank; > > > _3 = (integer(kind=8)) _2; > > > ... > > > } > > > } > > > and in main: > > > > > > struct array02_integer(kind=4) am; > > >: > > > MEM [(struct dtype_type *) + 24B] = {}; > > > am.dtype.elem_len = 4; > > > am.dtype.rank = 2; > > > am.dtype.type = 1; > > > ... > > > _52 = alsize (); > > > > > > Here array15_unknown and array02_integer are different structures with > > > different canonical types and thus we end up disambiguating the accesses > > > via base alias sets. > > > > > > My understanding is that this _unknown array descriptor is supposed to > > > be universal and work with all kinds of arrays. > > > > But the FE builds a new descriptor for each individual call and thus > > should build a universal descriptor for a call to an universal > > descriptor argument. > > I see, so you would expect call to alsize to initialize things in > array15_unkonwn type? That would work too. Yes, that's my expectation. But let's see what fortran folks say. Richard.
Re: New modref/ipa_modref optimization passes
> > > > The problem is: > > > > alsize (struct array15_unknown & restrict a) > > { > > ... > > _2 = *a_13(D).dtype.rank; > > _3 = (integer(kind=8)) _2; > > ... > > } > > } > > and in main: > > > > struct array02_integer(kind=4) am; > >: > > MEM [(struct dtype_type *) + 24B] = {}; > > am.dtype.elem_len = 4; > > am.dtype.rank = 2; > > am.dtype.type = 1; > > ... > > _52 = alsize (); > > > > Here array15_unknown and array02_integer are different structures with > > different canonical types and thus we end up disambiguating the accesses > > via base alias sets. > > > > My understanding is that this _unknown array descriptor is supposed to > > be universal and work with all kinds of arrays. > > But the FE builds a new descriptor for each individual call and thus > should build a universal descriptor for a call to an universal > descriptor argument. I see, so you would expect call to alsize to initialize things in array15_unkonwn type? That would work too. Honza
Re: New modref/ipa_modref optimization passes
On Mon, 21 Sep 2020, Jan Hubicka wrote: > > On Sun, 20 Sep 2020, Jan Hubicka wrote: > > > > > Hi, > > > this is patch I am using to fix the assumed_alias_type.f90 failure by > > > simply arranging alias set 0 for the problematic array descriptor. > > > > There's no such named testcase on trunk. Can you be more specific > > as to the problem at hand? It looks like gfortran.dg/assumed_type_9.f90 > > execute FAILs at the moment. > > > > In particular how's this not an issue w/o IPA modref? > > > > > For TYPE(*) I think the object itself cannot be accessed but for > > arrays the meta-info in the array descriptor can. Now my question > > would be why the Fortran FE at the call site does not build an > > appropriately typed array descriptor? > > > > CCing the fortran list. > > The problem is: > > alsize (struct array15_unknown & restrict a) > { > ... > _2 = *a_13(D).dtype.rank; > _3 = (integer(kind=8)) _2; > ... > } > } > and in main: > > struct array02_integer(kind=4) am; >: > MEM [(struct dtype_type *) + 24B] = {}; > am.dtype.elem_len = 4; > am.dtype.rank = 2; > am.dtype.type = 1; > ... > _52 = alsize (); > > Here array15_unknown and array02_integer are different structures with > different canonical types and thus we end up disambiguating the accesses > via base alias sets. > > My understanding is that this _unknown array descriptor is supposed to > be universal and work with all kinds of arrays. But the FE builds a new descriptor for each individual call and thus should build a universal descriptor for a call to an universal descriptor argument. Richard. > Wihtout modref this works because alsize is not inlined (we think code > size would grow). Forcing inliner to inline stil leads to working code > because we first constant propagate the pointer and then we see accesses > from same base DECL thus bypass the TBAA checks. Disabling the > constant propagation leads to wrong code as wel. > > Honza > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imend
Re: New modref/ipa_modref optimization passes
> On Sun, 20 Sep 2020, Jan Hubicka wrote: > > > Hi, > > this is patch I am using to fix the assumed_alias_type.f90 failure by > > simply arranging alias set 0 for the problematic array descriptor. > > There's no such named testcase on trunk. Can you be more specific > as to the problem at hand? It looks like gfortran.dg/assumed_type_9.f90 > execute FAILs at the moment. > > In particular how's this not an issue w/o IPA modref? > > For TYPE(*) I think the object itself cannot be accessed but for > arrays the meta-info in the array descriptor can. Now my question > would be why the Fortran FE at the call site does not build an > appropriately typed array descriptor? > > CCing the fortran list. The problem is: alsize (struct array15_unknown & restrict a) { ... _2 = *a_13(D).dtype.rank; _3 = (integer(kind=8)) _2; ... } } and in main: struct array02_integer(kind=4) am; : MEM [(struct dtype_type *) + 24B] = {}; am.dtype.elem_len = 4; am.dtype.rank = 2; am.dtype.type = 1; ... _52 = alsize (); Here array15_unknown and array02_integer are different structures with different canonical types and thus we end up disambiguating the accesses via base alias sets. My understanding is that this _unknown array descriptor is supposed to be universal and work with all kinds of arrays. Wihtout modref this works because alsize is not inlined (we think code size would grow). Forcing inliner to inline stil leads to working code because we first constant propagate the pointer and then we see accesses from same base DECL thus bypass the TBAA checks. Disabling the constant propagation leads to wrong code as wel. Honza
Re: New modref/ipa_modref optimization passes
On Sun, 20 Sep 2020, Jan Hubicka wrote: > Hi, > this is patch I am using to fix the assumed_alias_type.f90 failure by > simply arranging alias set 0 for the problematic array descriptor. There's no such named testcase on trunk. Can you be more specific as to the problem at hand? It looks like gfortran.dg/assumed_type_9.f90 execute FAILs at the moment. In particular how's this not an issue w/o IPA modref? For TYPE(*) I think the object itself cannot be accessed but for arrays the meta-info in the array descriptor can. Now my question would be why the Fortran FE at the call site does not build an appropriately typed array descriptor? CCing the fortran list. > I am not sure this is the best option, but I suppose it is better than > setting all array descritors to have same canonical type (as done by > LTO)? > > Honza > > * trans-types.c (gfc_get_array_type_bounds): Set alias set to 0 for > arrays of unknown element type. > diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c > index 26fdb2803a7..bef3d270c06 100644 > --- a/gcc/fortran/trans-types.c > +++ b/gcc/fortran/trans-types.c > @@ -1903,6 +1903,12 @@ gfc_get_array_type_bounds (tree etype, int dimen, int > codimen, tree * lbound, >base_type = gfc_get_array_descriptor_base (dimen, codimen, false); >TYPE_CANONICAL (fat_type) = base_type; >TYPE_STUB_DECL (fat_type) = TYPE_STUB_DECL (base_type); > + /* Arrays of unknown type must alias with all array descriptors. */ > + if (etype == ptr_type_node) > +{ > + TYPE_ALIAS_SET (base_type) = 0; > + TYPE_ALIAS_SET (fat_type) = 0; > +} > >tmp = TYPE_NAME (etype); >if (tmp && TREE_CODE (tmp) == TYPE_DECL) > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imend
Re: New modref/ipa_modref optimization passes
On Sun, 2020-09-20 at 19:30 +0200, Jan Hubicka wrote: > > [...] > > Should new C++ source files have a .cc suffix, rather than .c? > > > > [...] > > > > > + $(srcdir)/ipa-modref.h $(srcdir)/ipa-modref.c \ > > > > ...which would affect this^ > > I was wondering about that and decided to stay with .c since it is > what > other ipa passes use. I can rename the files. Given that they're in the source tree now, maybe better to wait until some mass renaming in the future? > We have some sources with > .c extension and others with .cc while they are now all C++. Is there > some plan to clean it up? I think we've been avoiding it, partly out of a concern of making backports harder, and also because someone has to do the work. That said, it's yet another unfinished transition, and is technical debt for the project. It's confusing to newcomers. It's been bugging me for a while, so I might take a look at doing it in this cycle. Dave
Re: New modref/ipa_modref optimization passes
> On Sun, 2020-09-20 at 00:32 +0200, Jan Hubicka wrote: > > Hi, > > this is cleaned up version of the patch. I removed unfinished bits, > > fixed > > propagation, cleaned it up and fixed fallout. > > [...] > > > While there are several areas for improvements but I think it is not > > in shape > > for mainline and rest can be dealt with incrementally. > > FWIW I think you typoed: > "not in shape for mainline" > when you meant: > "now in shape for mainline" > given... Yep, sorry for that :) > > Should new C++ source files have a .cc suffix, rather than .c? > > [...] > > > + $(srcdir)/ipa-modref.h $(srcdir)/ipa-modref.c \ > > ...which would affect this^ I was wondering about that and decided to stay with .c since it is what other ipa passes use. I can rename the files. We have some sources with .c extension and others with .cc while they are now all C++. Is there some plan to clean it up? > > +void > > +modref_tree_c_tests () > > +{ > > + test_insert_search_collapse (); > > + test_merge (); > > +} > > + > > +#endif > > It looks like these tests aren't being called anywhere; should they be? > > The convention is to put such tests inside namespace selftest and to > call the "run all the tests in this file" function from > selftest::run_tests. > > If you change this to be a .cc file, then the _c_ in the function name > should become a _cc_. > > [...] > > > diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h > > new file mode 100644 > > index 000..3bdd3058aa1 > > --- /dev/null > > +++ b/gcc/ipa-modref-tree.h > > [...] > > > +void modref_c_tests (); > > Likewise here; the convention is to declare these within > selftest.h > > [...] Thanks, indeed is seems that the self tests code was dropped at some point which I did not noticed. I will fix that up. > > Hope this is constructive It is :) Honza > Dave >
Re: New modref/ipa_modref optimization passes
On Sun, 2020-09-20 at 00:32 +0200, Jan Hubicka wrote: > Hi, > this is cleaned up version of the patch. I removed unfinished bits, > fixed > propagation, cleaned it up and fixed fallout. [...] > While there are several areas for improvements but I think it is not > in shape > for mainline and rest can be dealt with incrementally. FWIW I think you typoed: "not in shape for mainline" when you meant: "now in shape for mainline" given... > > Bootstrapped/regtested x86_64-linux including ada, d and go. I plan > to commit > it after bit more testing tomorrow. ...this, which suggests the opposite meaning. I didn't look at the guts of the patch, but I spotted some nits. > 2020-09-19 David Cepelik > Jan Hubicka > > * Makefile.in: Add ipa-modref.c and ipa-modref-tree.c. > * alias.c: (reference_alias_ptr_type_1): Export. > * alias.h (reference_alias_ptr_type_1): Declare. > * common.opt (fipa-modref): New. > * gengtype.c (open_base_files): Add ipa-modref-tree.h and ipa- > modref.h > * ipa-modref-tree.c: New file. > * ipa-modref-tree.h: New file. > * ipa-modref.c: New file. Should new C++ source files have a .cc suffix, rather than .c? [...] > + $(srcdir)/ipa-modref.h $(srcdir)/ipa-modref.c \ ...which would affect this^ [...] > diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c > new file mode 100644 > index 000..e37dee67fa3 > --- /dev/null > +++ b/gcc/ipa-modref-tree.c [...] > +#if CHECKING_P > + > + > +static void > +test_insert_search_collapse () > +{ [...] > +} > + > +static void > +test_merge () > +{ [...] > +} > + > + > +void > +modref_tree_c_tests () > +{ > + test_insert_search_collapse (); > + test_merge (); > +} > + > +#endif It looks like these tests aren't being called anywhere; should they be? The convention is to put such tests inside namespace selftest and to call the "run all the tests in this file" function from selftest::run_tests. If you change this to be a .cc file, then the _c_ in the function name should become a _cc_. [...] > diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h > new file mode 100644 > index 000..3bdd3058aa1 > --- /dev/null > +++ b/gcc/ipa-modref-tree.h [...] > +void modref_c_tests (); Likewise here; the convention is to declare these within selftest.h [...] Hope this is constructive Dave
Re: New modref/ipa_modref optimization passes
Hi, this is patch I am using to fix the assumed_alias_type.f90 failure by simply arranging alias set 0 for the problematic array descriptor. I am not sure this is the best option, but I suppose it is better than setting all array descritors to have same canonical type (as done by LTO)? Honza * trans-types.c (gfc_get_array_type_bounds): Set alias set to 0 for arrays of unknown element type. diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c index 26fdb2803a7..bef3d270c06 100644 --- a/gcc/fortran/trans-types.c +++ b/gcc/fortran/trans-types.c @@ -1903,6 +1903,12 @@ gfc_get_array_type_bounds (tree etype, int dimen, int codimen, tree * lbound, base_type = gfc_get_array_descriptor_base (dimen, codimen, false); TYPE_CANONICAL (fat_type) = base_type; TYPE_STUB_DECL (fat_type) = TYPE_STUB_DECL (base_type); + /* Arrays of unknown type must alias with all array descriptors. */ + if (etype == ptr_type_node) +{ + TYPE_ALIAS_SET (base_type) = 0; + TYPE_ALIAS_SET (fat_type) = 0; +} tmp = TYPE_NAME (etype); if (tmp && TREE_CODE (tmp) == TYPE_DECL)
Re: New modref/ipa_modref optimization passes
Hi, this is cleaned up version of the patch. I removed unfinished bits, fixed propagation, cleaned it up and fixed fallout. At present modref does bare minimum of what it can do and only collects base/ref alias sets of accesses to disambiguate uding them. I have followup patches adding more features. Like pure-const pass, modref is both local and IPA pass (so early optimizations benefits from it). It works with both LTO and non-LTO. LTO tracking is bit harder since instead of alias sets one needs to track types alias sets are derived from (because alias sets change from compile to link time). For that reason I had to exprt reference_alias_ptr_type_1 so I can lookup same type as get_alias_set does. There is an assert checking that both functions are in sync. Building cc1plus with LTO I get: Alias oracle query stats: refs_may_alias_p: 59342803 disambiguations, 69597406 queries ref_maybe_used_by_call_p: 139290 disambiguations, 60227579 queries call_may_clobber_ref_p: 17604 disambiguations, 23290 queries nonoverlapping_component_refs_p: 0 disambiguations, 37088 queries nonoverlapping_refs_since_match_p: 19439 disambiguations, 55079 must overlaps, 76966 queries aliasing_component_refs_p: 54891 disambiguations, 756375 queries TBAA oracle: 22475946 disambiguations 51373119 queries 15188806 are in alias set 0 8345975 queries asked about the same object 124 queries asked about the same alias set 0 access volatile 3811042 are dependent in the DAG 1551226 are aritificially in conflict with void * Modref stats: modref use: 6315 disambiguations, 60861 queries modref clobber: 1788334 disambiguations, 2706342 queries 1223123 tbaa querries (0.451947 per modref querry) PTA query stats: pt_solution_includes: 922059 disambiguations, 13392569 queries pt_solutions_intersect: 1018189 disambiguations, 11428599 queries This ignores the disambiguations enabled by the local modref done during early optimization. On cc1plus it seems that modref querries are cheap (usually converging after checking 0.45 alias sets :) and quite effective for stores with similar number of disambiguations as the PTA oracle (that is surprisingly high). Load disambiguations are quite low. Parlty this is becuase we only track lods that binds locally, since otherwise we risk that there is optimized out load that will be still present in prevailing version of the function and may lead to memory trap. Incrementally it may be worth to start tracking info about ealry optimized out loads, but that needs more work. Other problem I noticed while looking to dumps is that gimple, tree and rtl datastrutures are unions and we drop all accesses to them to alias set 0 that makes the anaysis not very effetive on GCC itself. This analysis of all those uninlined tree/rtl/gimple predicates quite ineffective that is bit sad. For tramp3d I get: Alias oracle query stats: refs_may_alias_p: 2261286 disambiguations, 2525487 queries ref_maybe_used_by_call_p: 7402 disambiguations, 2298244 queries call_may_clobber_ref_p: 232 disambiguations, 232 queries nonoverlapping_component_refs_p: 0 disambiguations, 4145 queries nonoverlapping_refs_since_match_p: 329 disambiguations, 10232 must overlaps, 10648 queries aliasing_component_refs_p: 858 disambiguations, 34731 queries TBAA oracle: 972943 disambiguations 1808294 queries 130565 are in alias set 0 497083 queries asked about the same object 0 queries asked about the same alias set 0 access volatile 207388 are dependent in the DAG 315 are aritificially in conflict with void * Modref stats: modref use: 950 disambiguations, 5474 queries modref clobber: 30035 disambiguations, 81948 queries 41671 tbaa querries (0.508505 per modref querry) PTA query stats: pt_solution_includes: 383095 disambiguations, 593201 queries pt_solutions_intersect: 147774 disambiguations, 434255 queries So again the loads are disambiguated less. There is a new testsuite failure: ./testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/assumed_type_9.f90 -O2 execution test ./testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/assumed_type_9.f90 -Os execution test This is an pre-existing issue where fortran frontend builds array descriptors in type that contains pointer to actual data while mixes them up with data type describing array of unknown type. The two structures are not TBAA compatible without LTO and thus leads to misoptimization. Same wrong code can be triggered by forcing inlining to happen and disabling ccp pass (that otherwise hides the problem), so I think we could handle this incrementally as independent bug - I would like to get basic code in so we can test additional patches in foreseeable future. While there are
New modref/ipa_modref optimization passes
Dear GCC developers, Below you may find a patch which implements two new optimization passes, called "modref" and "ipa_modref", which operate on GIMPLE and during WPO, respectively. What the passes do is fairly simple: for each analyzed function, they collect information about loads and stores performed by that function. The GIMPLE pass collects alias sets. The IPA pass collects sequences of types that make up each load/store. The IPA pass furthermore constructs a transitive closure from the data (so that we have information about loads/stores each function, and any function that it calls, performs). The data is collected in a data structure called a "modref tree". That's a really simple tree-like structure which is three levels deep; the first level is indexed by the base of the access, the second level is indexed by the ref of the access (using what seems to be standard GCC terminology for the innermost and outermost type of a reference) and the last is indexed by the range of the access (currently unused). The data structure's primary function is to organize the data and be able to quickly answer, given a load/store, whether any other load/store in the tree may alias it. The data structure has configurable limits on the size of each level and attempts to degrade gracefully when the limit is reached (i.e., it tries to keep as much information as possible without exceeding the limits). There's a lot of work to be done, but I believe it should mostly be refactoring and enhancing existing alias analysis by using the information which these new analyses collect. On the refactoring part: I need to rewrite the data structure to get rid of the C++ template. Since the data structure is used for both the GIMPLE pass and the IPA pass, which are indexed by alias sets and GIMPLE trees, respectively, a C++ template seemed appropriate. Wrong. Especially getting GC to work with a custom template was a nightmare, not to mention it brought lots of funky C++ peculiarities which I prefer to avoid. In the end, I couldn't avoid code duplication anyway, so some functions have a _tree and an _ipa version; I will address that as well. The integration with the alias oracle will be done incrementally over a series of patches which I will prepare with Honza Hubička. These shouldn't require extensive changes, so I hope they're fine as part of early stage3. No doubt, there will be some coding style issues and other violations of common sense which I have not yet learned. Last, I need to apologize. The patch worked with bootstrapping and passed all tests (except what was already broken in master). However, I've rebased at the last minute which was a poor judgement. With current master, I'm getting loads of warnings in stage2 with regard to unused variables and a few other diagnostics. I won't be able to fix that immediately as I need to discuss it with Honza first. It's therefore possible that I broke something with my the rebase and clean-ups. Please bear with me, I will fix this ASAP and send an updated version of the patch, which will bootstrap and pass all tests again with current master. To summarize, this patch implements a new analysis to collect information about loads/stores in functions. That can be used (and is used already, for the GIMPLE part) in the alias oracle. In theory, this should substantially improve alias analysis capabilities of GCC, leading to many more disambiguations in the alias oracle (because there will be much more information available about function's load/store refs). The patch needs a good deal of cleaning up and some further plumbing is required to get it to do some real work in the alias oracle. I think this is doable as part of stage3. I would very much like to see this in the upcoming release as I've spent quite some time with it. I'd like to thank Honza Hubička (my advisor) who was instrumental in putting this together. (But all bugs are mine.) Thanks go also to other guys from SUSE and others who were helping me to debug some peculiar issues with the garbage collector. Regards, David From 7d63fa301c12455aa6e9ba1b68f72e2d93fc2d2d Mon Sep 17 00:00:00 2001 From: David Cepelik Date: Fri, 11 Oct 2019 21:15:23 +0200 Subject: [PATCH] Modref pass --- ChangeLog| 20 + gcc/ChangeLog| 22 ++ gcc/Makefile.in |4 + gcc/gengtype.c |2 +- gcc/ipa-modref.c | 1012 ++ gcc/ipa-modref.h | 43 +++ gcc/lto-section-in.c |3 +- gcc/lto-streamer.h |1 + gcc/modref-tree.c| 342 + gcc/modref-tree.h| 355 ++ gcc/params.opt | 12 + gcc/passes.def |5 + gcc/timevar.def |2 + gcc/tree-pass.h |2 + gcc/tree-ssa-alias.c | 105 +- 15 files changed, 1924 insertions(+), 6 deletions(-) create mode 100644 gcc/ipa-modref.c create mode 100644 gcc/ipa-modref.h create mode