Re: Fully flow and context sensitive points-to analysis in GCC 4.6.0

Uday P. Khedker Thu, 11 Oct 2012 09:34:36 -0700


That's actually not true.  In fact existing GCC pointer analysis is
flow-sensitive for all SSA pointers.


SSA provides partial flow sensitivity to the top level pointers. For deeper
pointers, one needs to interleave SSA and points-to analysis. Besides, it cannot
handle global pointers which are important at the interprocedural level.

Points-to information is associated
with each (and only with!) SSA pointer variable via SSA_NAME_PTR_INFO.


I have enclosed a program fragment and dump fragment that shows
(a) flow insensitivity of GCC's points-to analysis
(b) points-to information for non-SSA variables.

At the same time, yes I reckon that the points-to information is encoded
in SSA_NAME_PTR_INFO and I wondered why the prefix SSA. Could you help
me by explaining the dump?

I suggest to look at the above and the disambiguators that use points-to
information in tree-ssa-alias.c (and their helpers in tree-ssa-struct-alias.c).


Since we are interested in interprocedural analysis, we would like to include
global variables and pointers at all levels of indirection. Hence we do not 
depend
on SSA variables. However, our analysis can become more efficient by separating 
SSA
names and other names.

I'd be interested to know how you analysis (is it interprocedural at all?)


Yes it is interprocedural and does not introduce approximations even in
the presence of recursion. Without interprocedural analysis, pointer analysis
is not much useful.

We use full call strings method (Sharir-Pnueli, 1981) for full flow and
context sensitivity but use value based termination (Khedker-Karkare, CC 2008)
that reduces the time and space by orders of magnitude (actually by a factor of 
a
million for recursive programs).

scales on programs the existing IPA PTA blows up (in both compile-time
and memory-usage, usually first memory-usage ;)).


You are right. As our paper shows, even with our value based termination,
the blow up is significant. Hence we factor in strong liveness. This reduces
the size of information dramatically without missing out on any useful 
information.
And there are more avenues of speeding up our analysis at the algorithmic level
(and I am not counting the use of better data structures which is the most
obvious thing to do.)

Did you investigate
on how to handle whole-program PTA at link-time with our distributed
WHOPR mode (thus at no point a single compiler process sees all
function bodies of the program but WPA phase sees the whole program
callgraph).


We have use LTO framework but did not use WHOPR mode because context sensitive
analysis is not possible for recursive programs by creating context independent
summaries by looking at one procedure at a time. Well, one can always compromise
on precision but we did not want to do that. So we use -flto-partition=none to
load all procedure bodies.

Thanks.

Uday.

-------------------------------------------------------------------------------------
Program fragment
-------------------------------------------------------------------------------------
#include <stdio.h>
int a, b, c, *e;
int main()
{

        if (a == b)
                e = &c;   /* statement n1 */
        else
                e = &b;   /* statement n2 */
        e = &a;           /* statement n3 */
        p();
}

p()
{
        printf ("%d", e);
}
----------------------------------------------------------------------------------------
Dump fragment. Note that flow sensitive analysis should say that
- e points only to c after n1,
- e points only to b after n2,
- e points only to c and b before n3,
- e points only a after n3.
However the analysis simply says that e points to a, b, c (ap0art from ESCAPED 
and NONLOCAL).
----------------------------------------------------------------------------------------
Points-to sets

NULL = { }
ANYTHING = { ANYTHING }
READONLY = { READONLY }
ESCAPED = { READONLY ESCAPED NONLOCAL a b c }
NONLOCAL = { ESCAPED NONLOCAL }
CALLUSED = { }
STOREDANYTHING = { }
INTEGER = { ANYTHING }
e.0_1 = same as e
e = { ESCAPED NONLOCAL a b c }
a.1_1 = { ESCAPED NONLOCAL }
a = same as a.1_1
b.2_2 = { ESCAPED NONLOCAL }
b = same as b.2_2
c = { ESCAPED NONLOCAL }

Re: Fully flow and context sensitive points-to analysis in GCC 4.6.0

Reply via email to