That's actually not true. In fact existing GCC pointer analysis is flow-sensitive for all SSA pointers.
SSA provides partial flow sensitivity to the top level pointers. For deeper pointers, one needs to interleave SSA and points-to analysis. Besides, it cannot handle global pointers which are important at the interprocedural level.
Points-to information is associated with each (and only with!) SSA pointer variable via SSA_NAME_PTR_INFO.
I have enclosed a program fragment and dump fragment that shows (a) flow insensitivity of GCC's points-to analysis (b) points-to information for non-SSA variables. At the same time, yes I reckon that the points-to information is encoded in SSA_NAME_PTR_INFO and I wondered why the prefix SSA. Could you help me by explaining the dump?
I suggest to look at the above and the disambiguators that use points-to information in tree-ssa-alias.c (and their helpers in tree-ssa-struct-alias.c).
Since we are interested in interprocedural analysis, we would like to include global variables and pointers at all levels of indirection. Hence we do not depend on SSA variables. However, our analysis can become more efficient by separating SSA names and other names.
I'd be interested to know how you analysis (is it interprocedural at all?)
Yes it is interprocedural and does not introduce approximations even in the presence of recursion. Without interprocedural analysis, pointer analysis is not much useful. We use full call strings method (Sharir-Pnueli, 1981) for full flow and context sensitivity but use value based termination (Khedker-Karkare, CC 2008) that reduces the time and space by orders of magnitude (actually by a factor of a million for recursive programs).
scales on programs the existing IPA PTA blows up (in both compile-time and memory-usage, usually first memory-usage ;)).
You are right. As our paper shows, even with our value based termination, the blow up is significant. Hence we factor in strong liveness. This reduces the size of information dramatically without missing out on any useful information. And there are more avenues of speeding up our analysis at the algorithmic level (and I am not counting the use of better data structures which is the most obvious thing to do.)
Did you investigate on how to handle whole-program PTA at link-time with our distributed WHOPR mode (thus at no point a single compiler process sees all function bodies of the program but WPA phase sees the whole program callgraph).
We have use LTO framework but did not use WHOPR mode because context sensitive analysis is not possible for recursive programs by creating context independent summaries by looking at one procedure at a time. Well, one can always compromise on precision but we did not want to do that. So we use -flto-partition=none to load all procedure bodies. Thanks. Uday. ------------------------------------------------------------------------------------- Program fragment ------------------------------------------------------------------------------------- #include <stdio.h> int a, b, c, *e; int main() { if (a == b) e = &c; /* statement n1 */ else e = &b; /* statement n2 */ e = &a; /* statement n3 */ p(); } p() { printf ("%d", e); } ---------------------------------------------------------------------------------------- Dump fragment. Note that flow sensitive analysis should say that - e points only to c after n1, - e points only to b after n2, - e points only to c and b before n3, - e points only a after n3. However the analysis simply says that e points to a, b, c (ap0art from ESCAPED and NONLOCAL). ---------------------------------------------------------------------------------------- Points-to sets NULL = { } ANYTHING = { ANYTHING } READONLY = { READONLY } ESCAPED = { READONLY ESCAPED NONLOCAL a b c } NONLOCAL = { ESCAPED NONLOCAL } CALLUSED = { } STOREDANYTHING = { } INTEGER = { ANYTHING } e.0_1 = same as e e = { ESCAPED NONLOCAL a b c } a.1_1 = { ESCAPED NONLOCAL } a = same as a.1_1 b.2_2 = { ESCAPED NONLOCAL } b = same as b.2_2 c = { ESCAPED NONLOCAL }