Re: [RFC] propagate malloc attribute in ipa-pure-const pass

Richard Biener Tue, 16 May 2017 04:42:13 -0700

On Mon, 15 May 2017, Prathamesh Kulkarni wrote:

> Hi,
> I have attached a bare-bones prototype patch that propagates malloc attribute 
> in
> ipa-pure-const. As far as I understand, from the doc a function could
> be annotated
> with malloc attribute if it returns a pointer to a newly allocated
> memory block (uninitialized or zeroed) and the pointer does not alias
> any other valid pointer ?
> 
> * Analysis
> The analysis used by the patch in malloc_candidate_p is too conservative,
> and I would be grateful for suggestions on improving it.
> Currently it only checks if:
> 1) The function returns a value of pointer type.
> 2) SSA_NAME_DEF_STMT (return_value) is either a function call or a phi, and
>     SSA_NAME_DEF_STMT(each element of phi) is function call.
> 3) The return-value has immediate uses only within comparisons (gcond
> or gassign) and return_stmt (and likewise a phi arg has immediate use only
> within comparison or the phi stmt).
>
> The intent is that malloc_candidate_p(node) returns true if node
> returns the return value of it's callee, and the return value is only
> used for comparisons within node.
> Then I assume it's safe (although conservative) to decide that node
> could possibly be a malloc function if callee is found to be malloc ?
> 
> for eg:
> void *f(size_t n)
> {
>   void *p = __builtin_malloc (n);
>   if (p == 0)
>     __builtin_abort ();
>   return p;
> }
> 
> malloc_candidate_p would determine f to be a candidate for malloc
> attribute since it returns the return-value of it's callee
> (__builtin_malloc) and the return value is used only within comparison
> and return_stmt.
> 
> However due to the imprecise analysis it misses this case:
> char  **g(size_t n)
> {
>   char **p = malloc (sizeof(char *) * 10);
>   for (int i = 0; i < 10; i++)
>     p[i] = malloc (sizeof(char) * n);
>   return p;
> }
> I suppose g() could be annotated with malloc here ?


It cannot because what p points to is initialized.  If you do then

 char **x = g(10);
 **x = 1;

will be optimized away.  Now what would be interesting is to do
"poor mans IPA PTA", namely identify functions that are a) small,
b) likely return a newly allocated pointer.  At PTA time then
"inline" all such wrappers, but only for the PTA constraints.

That will give more precise points-to results (and can handle more
cases, esp. initialization) than just annotating functions with 'malloc'.

+      /* With non-call exceptions we can't say for sure if other function
+        body was not possibly optimized to still throw.  */
+      if (!non_call || node->binds_to_current_def_p ())
+       {
+         DECL_IS_MALLOC (node->decl) = true;
+         *changed = true;

I don't see how malloc throwing would be an issue.

+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, retval)
+    if (gcond *cond = dyn_cast<gcond *> (use_stmt))
+      {
+       enum tree_code code = gimple_cond_code (cond);
+       if (TREE_CODE_CLASS (code) != tcc_comparison)

always false

+         RETURN_FROM_IMM_USE_STMT (use_iter, false);

I think you want to disallow eq/ne_expr against sth not integer_zerop.

+    else if (gassign *ga = dyn_cast<gassign *> (use_stmt))
+      {
+       enum tree_code code = gimple_assign_rhs_code (ga);
+       if (TREE_CODE_CLASS (code) != tcc_comparison)
+         RETURN_FROM_IMM_USE_STMT (use_iter, false);

likewise.

+static bool
+malloc_candidate_p (function *fun, vec<cgraph_node *>& callees)
+{
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fun)
+    {
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+          !gsi_end_p (gsi);
+          gsi_next (&gsi))
+       {
+         greturn *ret_stmt = dyn_cast<greturn *> (gsi_stmt (gsi));
+         if (ret_stmt)

I think you want do do this much cheaper by only walking the last
stmt of predecessor(s) of EXIT_BLOCK_FOR_FN.  (s) because
we have multiple return stmts for

int foo (int i)
{
  if (i)
    return;
  else
    return i;
}

but all return stmts will be the last stmt in one of the exit block
predecessors.

+             if (!check_retval_uses (retval, ret_stmt))
+               return false;
+
+             gimple *def = SSA_NAME_DEF_STMT (retval);
+             if (gcall *call_stmt = dyn_cast<gcall *> (def))
+               {
+                 tree callee_decl = gimple_call_fndecl (call_stmt);
+                 /* FIXME: In case of indirect call, callee_decl might be 
NULL
+                    Not sure what to do in that case, punting for now.  
*/
+                 if (!callee_decl)
+                   return false;
+                 cgraph_node *callee = cgraph_node::get_create 
(callee_decl);
+                 callees.safe_push (callee);
+               }
+             else if (gphi *phi = dyn_cast<gphi *> (def))
+               for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+                 {
+                   tree arg = gimple_phi_arg_def (phi, i);
+                   if (TREE_CODE (arg) != SSA_NAME)
+                     return false;
+                   if (!check_retval_uses (arg, phi))
+                     return false;
+

I think you should refactor this to a separate function and handle
recursion.  For recursion you miss simple SSA name assigns.

For PHI args you want to allow literal zero as well (for
-fdelete-null-pointer-checks).

> The patch uses return_calles_map which is a hash_map<node, callees> such
> that the return value of node is return value of one of these callees.
> For above eg it would be <f, [__builtin_malloc]>
> The analysis phase populates return_callees_map, and the propagation
> phase uses it to take the "meet" of callees.

I think you are supposed to treat functions optimistically as "malloc"
(use the DECL_IS_MALLOC flag) and during propagation revisit that
change by propagating across callgraph edges.  callgraph edges that are
relevant (calls feeding a pointer return value) should then be
marked somehow (set a flag you stream).

This also means that indirect calls should be only handled during
propagation.

Not sure if we have an example of a "edge encoder".  Honza?

Bonus points for auto-detecting alloc_size and alloc_align attributes
and warning with -Wsuggest-attribute=malloc{_size,_align,}.

> * LTO and memory management
> This is a general question about LTO and memory management.
> IIUC the following sequence takes place during normal LTO:
> LGEN: generate_summary, write_summary
> WPA: read_summary, execute ipa passes, write_opt_summary
> 
> So I assumed it was OK in LGEN to allocate return_callees_map in
> generate_summary and free it in write_summary and during WPA, allocate
> return_callees_map in read_summary and free it after execute (since
> write_opt_summary does not require return_callees_map).
> 
> However with fat LTO, it seems the sequence changes for LGEN with
> execute phase takes place after write_summary. However since
> return_callees_map is freed in pure_const_write_summary and
> propagate_malloc() accesses it in execute stage, it results in
> segmentation fault.
> 
> To work around this, I am using the following hack in 
> pure_const_write_summary:
> // FIXME: Do not free if -ffat-lto-objects is enabled.
> if (!global_options.x_flag_fat_lto_objects)
>   free_return_callees_map ();
> Is there a better approach for handling this ?
> 
> Also I am not sure if I have written cgraph_node::set_malloc_flag[_1] 
> correctly.
> I tried to imitate cgraph_node::set_const_flag.
> 
> The patch passes bootstrap+test on x86_64 and found a few functions in
> the source tree (attached func_names.txt) that could be annotated with
> malloc (I gave a brief look at some of the functions and didn't appear
> to be false positives but I will recheck thoroughly)
> 
> Does the patch look in the right direction ?
> I would be grateful for suggestions on improving it.
> 
> Thanks,
> Prathamesh
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

Reply via email to