On Fri, Sep 25, 2020 at 9:05 AM Erick Ochoa
<erick.oc...@theobroma-systems.com> wrote:
>
> Hi,
>
> I am working on an alias analysis using the points-to information
> generated during IPA-PTA. If we look at the varmap varinfo_t array in
> gcc/tree-ssa-structalias.c, most of the constraint variable info structs
> contain a non-null decl field which points to a valid tree in gimple
> (which is an SSA variable and a pointer). I am trying to find a way
> to obtain points-to information for pointer expressions. The
> concrete example I have in mind is answering the following question:
>
> What does `astruct->aptrfield` point to?
>
> Here I have a concrete example:
>
>
> #include <stdlib.h>
>
> struct A { char *f1; struct A *f2;};
>
> int __GIMPLE(startwith("ipa-pta"))
> main (int argc, char * * argv)
> {
>   struct A * p1;
>   char * pc;
>   int i;
>   int _27;
>
>   i_15 = 1;
>   pc = malloc(100); // HEAP(1)
>   p1 = malloc (16); // HEAP(2)
>   p1->f1 = pc;
>   p1->f2 = p1;
>   _27 = (int) 0;
>   return _27;
> }
>
>
> Will give the following correct points-to information:
>
> HEAP(1) = { }
> HEAP(2) = { HEAP(1) HEAP(2) }
> pc_30 = { HEAP(1) }
> p1_32 = { HEAP(2) }
>
> However, there does not seem to be information printed for either:
>
> p1->f1
> p1->f2
>
> for which I would expect (or like) something like:
>
> p1_32->0 = { HEAP(1) }
> p1_32->64 = { HEAP(2) }
>
> Looking more closely at the problem, I found that some varinfo_t have a
> non-null "complex" field, which holds an array of "complex" constraints
> used to handle offsets and dereferences in gimple. For this same gimple
> code, we have the following complex constraints for the variable p1_32:
>
> main.clobber = p1_32 + 64
> *p1_32 = pc_30
> *p1_32 + 64 = p1_32

The issue is that allocated storage is not tracked field-sensitively since
we do not know its layout at the point of allocation (where we allocate
the HEAP variable). There are some exceptions; see what we do
for by-reference parameters in create_variable_info_for_1:

  if (vi->only_restrict_pointers
      && !type_contains_placeholder_p (TREE_TYPE (decl_type))
      && handle_param
      && !bitmap_bit_p (handled_struct_type,
                        TYPE_UID (TREE_TYPE (decl_type))))
    {
      varinfo_t rvi;
      tree heapvar = build_fake_var_decl (TREE_TYPE (decl_type));
      DECL_EXTERNAL (heapvar) = 1;
      if (var_can_have_subvars (heapvar))
        bitmap_set_bit (handled_struct_type,
                        TYPE_UID (TREE_TYPE (decl_type)));
      rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", true,
                                        true, handled_struct_type);
      if (var_can_have_subvars (heapvar))
        bitmap_clear_bit (handled_struct_type,
                          TYPE_UID (TREE_TYPE (decl_type)));
      rvi->is_restrict_var = 1;
      insert_vi_for_tree (heapvar, rvi);
      make_constraint_from (vi, rvi->id);
      make_param_constraints (rvi);

where we create a heapvar with a specific aggregate type. Generally
make_heapvar (for the allocation case) allocates a variable without
subfields:

  static varinfo_t
  make_heapvar (const char *name, bool add_id)
  {
    varinfo_t vi;
    tree heapvar;
    heapvar = build_fake_var_decl (ptr_type_node);
    DECL_EXTERNAL (heapvar) = 1;
    vi = new_var_info (heapvar, name, add_id);
    vi->is_heap_var = true;
    vi->is_unknown_size_var = true;
    vi->offset = 0;
    vi->fullsize = ~0;
    vi->size = ~0;
    vi->is_full_var = true;

I once attempted to split (aka generate subfields for) a variable
on demand during solving, but that never worked well.

So for specific cases like C++ new T we could create appropriately
typed heapvars. But you have to double-check for correctness
because of may_have_pointers and so on.

> It seems to me that I can probably parse these complex constraints to
> generate the answers which I want. Is this the way this is currently
> being handled in GCC or is there some other standard mechanism for this?

GCC is in the end only interested in points-to sets for SSA names
which never have subfields. The missing subfields for aggregates
simply make the points-to solution less precise.
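
To make the precision loss concrete, here is a minimal sketch in plain C.
It is a toy model, not GCC's solver: every toy_* name is hypothetical and
exists only for this illustration. It replays the two stores from the
example (p1->f1 = pc; p1->f2 = p1;) into a heap variable that has a single
subfield, as make_heapvar produces, and into one that has one subfield per
struct member, as a typed heapvar could. Offsets are expressed in bits,
matching the "+ 64" seen in the constraints for f2 on a typical 64-bit
target.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Same layout as the struct in the example.  */
struct A { char *f1; struct A *f2; };

/* Toy abstract locations; bit I of a set means "may point to I".  */
enum { HEAP1, HEAP2 };
typedef unsigned pts_set;

#define MAX_FIELDS 2

/* Hypothetical stand-in for a heap variable: a list of subfields, each
   with a bit offset and its own points-to set.  */
struct toy_var
{
  int nfields;
  size_t bit_offset[MAX_FIELDS];
  pts_set pts[MAX_FIELDS];
};

/* One subfield covering the whole object (field-insensitive).  */
struct toy_var
toy_make_flat (void)
{
  struct toy_var v = { 1, { 0, 0 }, { 0, 0 } };
  return v;
}

/* One subfield per member of struct A (field-sensitive).  */
struct toy_var
toy_make_typed (void)
{
  struct toy_var v = { 2, { 0, 0 }, { 0, 0 } };
  v.bit_offset[0] = offsetof (struct A, f1) * CHAR_BIT;
  v.bit_offset[1] = offsetof (struct A, f2) * CHAR_BIT;
  return v;
}

/* Index of the last subfield starting at or before BIT_OFF.  With a
   single subfield every offset maps to index 0.  */
int
toy_field_at (const struct toy_var *v, size_t bit_off)
{
  int best = 0;
  assert (v->nfields >= 1 && v->nfields <= MAX_FIELDS);
  for (int i = 0; i < v->nfields; i++)
    if (v->bit_offset[i] <= bit_off)
      best = i;
  return best;
}

/* A store unions RHS into the subfield it touches.  */
void
toy_store (struct toy_var *v, size_t bit_off, pts_set rhs)
{
  v->pts[toy_field_at (v, bit_off)] |= rhs;
}

/* A load returns the set of the subfield it touches.  */
pts_set
toy_load (const struct toy_var *v, size_t bit_off)
{
  return v->pts[toy_field_at (v, bit_off)];
}

/* Replay the two stores from the example into V.  */
void
toy_replay (struct toy_var *v)
{
  /* p1->f1 = pc;  with pc = { HEAP(1) }  */
  toy_store (v, offsetof (struct A, f1) * CHAR_BIT, 1u << HEAP1);
  /* p1->f2 = p1;  with p1 = { HEAP(2) }  */
  toy_store (v, offsetof (struct A, f2) * CHAR_BIT, 1u << HEAP2);
}
```

With the flat variable both loads return { HEAP(1) HEAP(2) }, which is the
HEAP(2) solution shown in the dump. With the typed variable a load at the
offset of f1 returns only { HEAP(1) } and a load at the offset of f2 only
{ HEAP(2) }. Both answers are sound; the flat one is just coarser.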

Richard.

> Thanks!