Re: daily report on extending static analyzer project [GSoC]

David Malcolm via Gcc Tue, 06 Jul 2021 16:11:50 -0700

On Mon, 2021-07-05 at 21:45 +0530, Ankur Saini wrote:
> I forgot to send the daily report yesterday, so this one covers the
> work done on both days
> 
> AIM : 
> 
> - make the analyzer call functions with the updated call-string
> representation ( even the ones that don’t have a superedge )
> - make the analyzer figure out the point of return from a function
> called without a superedge
> - make the analyzer figure out the correct point to return to in the
> caller function
> - create an enode and eedge representing the return from the call
> - test the changes on the example I created before
> - speculate what GCC generates for a vfunc call and discuss how we can
> use it to our advantage
> 
> —
> 
> PROGRESS  ( changes can be seen on the
> "refs/users/arsenic/heads/analyzer_extension" branch of the repository
> ) :
> 
> - Thanks to the new call-string representation, I was able to push
> calls that don’t have a superedge onto the call stack, and could
> see the calls happening via the function pointer.
> 
> - To detect the returning point of the function I used the fact that
> such supernodes would contain an EXIT bb, would not have any return
> superedge and would still have a pending call-stack. 
> 
> - The next part was to find the destination node of the return.
> For this I again made use of the new call string and created a custom
> accessor to get the caller and callee supernodes of the return call,
> then extracted the gcall* from the caller supernode to update the
> program state.
> 
> - Now that I had the next state and the next point, it was time to put
> the final piece of the puzzle together and create the exploded node and
> edge representing the return from the call.
> 
> - I tested the changes on the following program, where the analyzer
> was earlier giving a false positive ( a spurious memory-leak report )
> because it did not detect the call via a function pointer
> 
> ```
> #include <stdio.h>
> #include <stdlib.h>
> 
> void fun(int *int_ptr)
> {
>     free(int_ptr);
> }
> 
> int test()
> {
>     int *int_ptr = (int*)malloc(sizeof(int));
>     void (*fun_ptr)(int *) = &fun;
>     (*fun_ptr)(int_ptr);
> 
>     return 0;
> }
> 
> void test_2()
> {
>   test();
> }
> ```
> ( compiler explorer link : https://godbolt.org/z/9KfenGET9 )
> 
> and the results showed success: the analyzer was now able to
> detect, call and return from the function that was called
> via the function pointer, and no longer reported the memory leak it was
> reporting before. : )


This is great; well done!

It would be good to turn the above into a regression test.  I think you
can do that by simply adding it to gcc/testsuite/gcc.dg/analyzer.  You
could also add a case where fun_ptr is called twice, and check that it
reports it as a double-free (and add a dg-warning directive to verify
that it correctly complains).
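
A minimal sketch of what such a test might look like. The function names
single_call and double_call are hypothetical, and both the wording of
the dg-warning text and the line it is attached to are assumptions that
would need checking against what the branch actually prints:

```c
/* Sketch of a possible gcc.dg/analyzer test; the directive text below
   is an assumption and must match what the analyzer really prints.  */

#include <stdlib.h>

void fun(int *int_ptr)
{
    /* Assumed location and wording of the diagnostic.  */
    free(int_ptr); /* { dg-warning "double-'free' of 'int_ptr'" } */
}

/* One call through the pointer: no leak and no double-free expected.  */
int single_call(void)
{
    int *int_ptr = (int*)malloc(sizeof(int));
    void (*fun_ptr)(int *) = &fun;
    (*fun_ptr)(int_ptr);

    return 0;
}

/* Two calls through the pointer: the second one frees int_ptr again.
   Compile-only; this function must never actually be executed.  */
int double_call(void)
{
    int *int_ptr = (int*)malloc(sizeof(int));
    void (*fun_ptr)(int *) = &fun;
    (*fun_ptr)(int_ptr);
    (*fun_ptr)(int_ptr);

    return 0;
}
```

The test is only meant to be compiled with the analyzer enabled, not
run, since double_call really would free the same pointer twice.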

I wonder if your branch has already fixed:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100546

> 
> - I think I should point this out: in the process I created a lot of
> custom functions to access/alter some data, which was not possible
> before.
> 
> - now that calls via function pointer are taken care of, it was time
> to see what exactly GCC generates when a function is
> dispatched dynamically, and as planned earlier, I went to
> ipa-devirt.c ( GCC’s implementation of the devirtualizer ) to investigate.
> 
> - although I didn’t understand everything that was happening there,
> here are some of the findings I thought might be interesting for the
> project :- 
>         > the polymorphic call is made through an OBJ_TYPE_REF, which
> contains otr_type ( the type of the class whose method is called ) and
> otr_token ( the index into the virtual table where the address is taken )
>         > the devirtualizer builds a type inheritance graph to keep
> track of the entire inheritance hierarchy
>         > the most interesting function I found was
> “possible_polymorphic_call_targets()”, which returns the vector of all
> possible targets of a polymorphic call represented by a call edge or a
> gcall.
>         > as far as I understood, the devirtualizer searches
> these polymorphic calls, picks out the calls which are more
> likely to be called ( known as likely calls ), and turns them into
> speculative calls which are later turned into direct calls.
> 
> - another thing I was curious to know was how the analyzer would behave
> when it encounters a polymorphic call, now that we are splitting
> the superedges at every call. 
> 
> the results were interesting: I was able to see the analyzer splitting
> supernodes for the calls right away, but this time they were not
> connected via an intraprocedural edge, making the analyzer crash at
> the callsite ( I will look more into it tomorrow ) 
> 
> the example I used was : -
> ```
> struct A
> {
>     virtual int foo (void) 
>     {
>         return 42;
>     }
> };
> 
> struct B: public A
> {
>   int foo (void) 
>     { 
>         return 0;
>     }
> };
> 
> int test()
> {
>     struct B b, *bptr=&b;
>     bptr->foo();
>     return bptr->foo();
> }
> ```
> ( compiler explorer link : https://godbolt.org/z/d986ab7MY )
> 

I can see the crash in gdb:

In state_purge_per_ssa_name::process_point, when
  if (snode->m_returning_call)
the code assumes that there will be a cgraph_edge, which isn't the case
anymore; it will need to go from the "return" supernode to the "call"
supernode (both within the caller function).


> —
> 
> STATUS AT THE END OF THE DAY :- 
> 
> - make the analyzer call functions with the updated call-string
> representation ( even the ones that don’t have a superedge ) (done)
> - make the analyzer figure out the point of return from a function
> called without a superedge (done)
> - make the analyzer figure out the correct point to return to in
> the caller function (done)
> - create an enode and eedge representing the return from the call (done)
> - test the changes on the example I created before (done)
> - speculate what GCC generates for a vfunc call and discuss how we can
> use it to our advantage (done)
> 

Good work; looks promising.
Dave
