On Fri, Sep 25, 2020 at 9:05 AM Erick Ochoa
<erick.oc...@theobroma-systems.com> wrote:
>
> Hi,
>
> I am working on an alias analysis using the points-to information
> generated during IPA-PTA. If we look at the varmap varinfo_t array in
> gcc/tree-ssa-structalias.c, most of the constraint variable info structs
> contain a non-null decl field which points to a valid tree in gimple
> (which is an SSA variable and a pointer). I am trying to find out a way
> to obtain points-to information for pointer expressions. By this, the
> concrete example I have in mind is answering the following question:
>
> What does `astruct->aptrfield` point to?
>
> Here I have a concrete example:
>
>
> #include <stdlib.h>
>
> struct A { char *f1; struct A *f2;};
>
> int __GIMPLE(startwith("ipa-pta"))
> main (int argc, char * * argv)
> {
>    struct A * p1;
>    char * pc;
>    int i;
>    int _27;
>
>    i_15 = 1;
>    pc = malloc(100); // HEAP(1)
>    p1 = malloc (16); // HEAP(2)
>    p1->f1 = pc;
>    p1->f2 = p1;
>    _27 = (int) 0;
>    return _27;
> }
>
>
> Will give the following correct points-to information:
>
> HEAP(1) = { }
> HEAP(2) = { HEAP(1) HEAP(2) }
> pc_30 = { HEAP(1) }
> p1_32 = { HEAP(2) }
>
> However, there does not seem to be information printed for either:
>
> p1->f1
> p1->f2
>
> which I would expect (or like) something like:
>
> p1_32->0 = { HEAP(1) }
> p1_32->64 = { HEAP(2) }
>
> Looking more closely at the problem, I found that some varinfo_t have a
> non-null "complex" field, which has an array of "complex" constraints
> used to handle offsets and dereferences in gimple. For this same gimple
> code, we have the following complex constraints for the variable p1_32:
>
> main.clobber = p1_32 + 64
> *p1_32 = pc_30
> *p1_32 + 64 = p1_32

The issue is that allocated storage is not tracked field-sensitively,
since we do not know its layout at the point of allocation (where
we allocate the HEAP variable). There are some exceptions; see
what we do for by-reference parameters in
create_variable_info_for_1:

  if (vi->only_restrict_pointers
      && !type_contains_placeholder_p (TREE_TYPE (decl_type))
      && handle_param
      && !bitmap_bit_p (handled_struct_type,
                        TYPE_UID (TREE_TYPE (decl_type))))
    {
      varinfo_t rvi;
      tree heapvar = build_fake_var_decl (TREE_TYPE (decl_type));
      DECL_EXTERNAL (heapvar) = 1;
      if (var_can_have_subvars (heapvar))
        bitmap_set_bit (handled_struct_type,
                        TYPE_UID (TREE_TYPE (decl_type)));
      rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", true,
                                        true, handled_struct_type);
      if (var_can_have_subvars (heapvar))
        bitmap_clear_bit (handled_struct_type,
                          TYPE_UID (TREE_TYPE (decl_type)));
      rvi->is_restrict_var = 1;
      insert_vi_for_tree (heapvar, rvi);
      make_constraint_from (vi, rvi->id);
      make_param_constraints (rvi);

where we create a heapvar with a specific aggregate type. Generally,
make_heapvar (for the allocation case) allocates a variable
without subfields:

static varinfo_t
make_heapvar (const char *name, bool add_id)
{
  varinfo_t vi;
  tree heapvar;

  heapvar = build_fake_var_decl (ptr_type_node);
  DECL_EXTERNAL (heapvar) = 1;

  vi = new_var_info (heapvar, name, add_id);
  vi->is_heap_var = true;
  vi->is_unknown_size_var = true;
  vi->offset = 0;
  vi->fullsize = ~0;
  vi->size = ~0;
  vi->is_full_var = true;

I once attempted to split (that is, generate subfields for) a variable
on demand during solving, but that never worked well.

So for specific cases like C++ new T we could create appropriately
typed heapvars. But you have to double-check for correctness
because of may_have_pointers and so on.

> It seems to me that I can probably parse these complex constraints to
> generate the answers which I want. Is this the way this is currently
> being handled in GCC or is there some other standard mechanism for this?

GCC is in the end only interested in points-to sets for SSA names,
which never have subfields. The missing subfields for aggregates
simply make the points-to solution less precise.

Richard.

> Thanks!
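
To illustrate the precision loss described above, here is a minimal
standalone sketch in C. It is not GCC code: the names pts_set,
struct model, store, load and run_example are hypothetical, and
points-to sets are modelled as plain bitmasks. It replays the two stores
from the example (p1->f1 = pc; p1->f2 = p1) against a heap object that is
either a single field-insensitive blob (as make_heapvar creates it) or
split into one slot per field.

```c
#include <assert.h>

/* Hypothetical toy model, not GCC's solver.  */
enum { HEAP1 = 0, HEAP2 = 1, NUM_OBJS = 2, NUM_FIELDS = 2 };

/* Bit i set means "may point to object i".  */
typedef unsigned pts_set;

struct model
{
  int field_sensitive;                     /* 0: one blob per object.  */
  pts_set contents[NUM_OBJS][NUM_FIELDS];  /* What each slot may hold.  */
};

/* Map a field index to a storage slot.  A field-insensitive object
   has a single slot, so every field collapses onto slot 0.  */
static int
slot (const struct model *m, int field)
{
  assert (field >= 0 && field < NUM_FIELDS);
  return m->field_sensitive ? field : 0;
}

/* Model "*base + field = value": every object BASE may point to
   may now hold VALUE in that slot.  */
static void
store (struct model *m, pts_set base, int field, pts_set value)
{
  for (int o = 0; o < NUM_OBJS; o++)
    if (base & (1u << o))
      m->contents[o][slot (m, field)] |= value;
}

/* Model "result = *base + field": union over all pointed-to objects.  */
static pts_set
load (const struct model *m, pts_set base, int field)
{
  pts_set result = 0;
  for (int o = 0; o < NUM_OBJS; o++)
    if (base & (1u << o))
      result |= m->contents[o][slot (m, field)];
  return result;
}

/* Replay the example and return the set read back from p1->f<field>
   (field 0 stands for f1, field 1 for f2).  */
static pts_set
run_example (int field_sensitive, int field)
{
  struct model m = { field_sensitive, { { 0 } } };
  pts_set pc = 1u << HEAP1;   /* pc = malloc (100);  HEAP(1) */
  pts_set p1 = 1u << HEAP2;   /* p1 = malloc (16);   HEAP(2) */

  store (&m, p1, 0, pc);      /* p1->f1 = pc;  */
  store (&m, p1, 1, p1);      /* p1->f2 = p1;  */
  return load (&m, p1, field);
}
```

In the field-insensitive mode, run_example (0, 0) returns both HEAP(1)
and HEAP(2) for p1->f1, which matches HEAP(2) = { HEAP(1) HEAP(2) } in
the dump. With per-field slots, run_example (1, 0) returns only HEAP(1)
and run_example (1, 1) only HEAP(2), which is the answer the original
question was hoping for.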