On Mon, 15 May 2017, Prathamesh Kulkarni wrote: > Hi, > I have attached a bare-bones prototype patch that propagates malloc attribute > in > ipa-pure-const. As far as I understand, from the doc a function could > be annotated > with malloc attribute if it returns a pointer to a newly allocated > memory block (uninitialized or zeroed) and the pointer does not alias > any other valid pointer ? > > * Analysis > The analysis used by the patch in malloc_candidate_p is too conservative, > and I would be grateful for suggestions on improving it. > Currently it only checks if: > 1) The function returns a value of pointer type. > 2) SSA_NAME_DEF_STMT (return_value) is either a function call or a phi, and > SSA_NAME_DEF_STMT(each element of phi) is function call. > 3) The return-value has immediate uses only within comparisons (gcond > or gassign) and return_stmt (and likewise a phi arg has immediate use only > within comparison or the phi stmt). > > The intent is that malloc_candidate_p(node) returns true if node > returns the return value of it's callee, and the return value is only > used for comparisons within node. > Then I assume it's safe (although conservative) to decide that node > could possibly be a malloc function if callee is found to be malloc ? > > for eg: > void *f(size_t n) > { > void *p = __builtin_malloc (n); > if (p == 0) > __builtin_abort (); > return p; > } > > malloc_candidate_p would determine f to be a candidate for malloc > attribute since it returns the return-value of it's callee > (__builtin_malloc) and the return value is used only within comparison > and return_stmt. > > However due to the imprecise analysis it misses this case: > char **g(size_t n) > { > char **p = malloc (sizeof(char *) * 10); > for (int i = 0; i < 10; i++) > p[i] = malloc (sizeof(char) * n); > return p; > } > I suppose g() could be annotated with malloc here ?
It cannot because what p points to is initialized. If you do then char **x = g(10); **x = 1; will be optimized away. Now what would be interesting is to do "poor mans IPA PTA", namely identify functions that are a) small, b) likely return a newly allocated pointer. At PTA time then "inline" all such wrappers, but only for the PTA constraints. That will give more precise points-to results (and can handle more cases, esp. initialization) than just annotating functions with 'malloc'. + /* With non-call exceptions we can't say for sure if other function + body was not possibly optimized to still throw. */ + if (!non_call || node->binds_to_current_def_p ()) + { + DECL_IS_MALLOC (node->decl) = true; + *changed = true; I don't see how malloc throwing would be an issue. + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, retval) + if (gcond *cond = dyn_cast<gcond *> (use_stmt)) + { + enum tree_code code = gimple_cond_code (cond); + if (TREE_CODE_CLASS (code) != tcc_comparison) always false + RETURN_FROM_IMM_USE_STMT (use_iter, false); I think you want to disallow eq/ne_expr against sth not integer_zerop. + else if (gassign *ga = dyn_cast<gassign *> (use_stmt)) + { + enum tree_code code = gimple_assign_rhs_code (ga); + if (TREE_CODE_CLASS (code) != tcc_comparison) + RETURN_FROM_IMM_USE_STMT (use_iter, false); likewise. +static bool +malloc_candidate_p (function *fun, vec<cgraph_node *>& callees) +{ + basic_block bb; + FOR_EACH_BB_FN (bb, fun) + { + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); + !gsi_end_p (gsi); + gsi_next (&gsi)) + { + greturn *ret_stmt = dyn_cast<greturn *> (gsi_stmt (gsi)); + if (ret_stmt) I think you want do do this much cheaper by only walking the last stmt of predecessor(s) of EXIT_BLOCK_FOR_FN. (s) because we have multiple return stmts for int foo (int i) { if (i) return; else return i; } but all return stmts will be the last stmt in one of the exit block predecessors. + if (!check_retval_uses (retval, ret_stmt)) + return false; + + gimple *def = SSA_NAME_DEF_STMT (retval); + if (gcall *call_stmt = dyn_cast<gcall *> (def)) + { + tree callee_decl = gimple_call_fndecl (call_stmt); + /* FIXME: In case of indirect call, callee_decl might be NULL + Not sure what to do in that case, punting for now. */ + if (!callee_decl) + return false; + cgraph_node *callee = cgraph_node::get_create (callee_decl); + callees.safe_push (callee); + } + else if (gphi *phi = dyn_cast<gphi *> (def)) + for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i) + { + tree arg = gimple_phi_arg_def (phi, i); + if (TREE_CODE (arg) != SSA_NAME) + return false; + if (!check_retval_uses (arg, phi)) + return false; + I think you should refactor this to a separate function and handle recursion. For recursion you miss simple SSA name assigns. For PHI args you want to allow literal zero as well (for -fdelete-null-pointer-checks). > The patch uses return_calles_map which is a hash_map<node, callees> such > that the return value of node is return value of one of these callees. > For above eg it would be <f, [__builtin_malloc]> > The analysis phase populates return_callees_map, and the propagation > phase uses it to take the "meet" of callees. I think you are supposed to treat functions optimistically as "malloc" (use the DECL_IS_MALLOC flag) and during propagation revisit that change by propagating across callgraph edges. callgraph edges that are relevant (calls feeding a pointer return value) should then be marked somehow (set a flag you stream). This also means that indirect calls should be only handled during propagation. Not sure if we have an example of a "edge encoder". Honza? Bonus points for auto-detecting alloc_size and alloc_align attributes and warning with -Wsuggest-attribute=malloc{_size,_align,}. > * LTO and memory management > This is a general question about LTO and memory management. > IIUC the following sequence takes place during normal LTO: > LGEN: generate_summary, write_summary > WPA: read_summary, execute ipa passes, write_opt_summary > > So I assumed it was OK in LGEN to allocate return_callees_map in > generate_summary and free it in write_summary and during WPA, allocate > return_callees_map in read_summary and free it after execute (since > write_opt_summary does not require return_callees_map). > > However with fat LTO, it seems the sequence changes for LGEN with > execute phase takes place after write_summary. However since > return_callees_map is freed in pure_const_write_summary and > propagate_malloc() accesses it in execute stage, it results in > segmentation fault. > > To work around this, I am using the following hack in > pure_const_write_summary: > // FIXME: Do not free if -ffat-lto-objects is enabled. > if (!global_options.x_flag_fat_lto_objects) > free_return_callees_map (); > Is there a better approach for handling this ? > > Also I am not sure if I have written cgraph_node::set_malloc_flag[_1] > correctly. > I tried to imitate cgraph_node::set_const_flag. > > The patch passes bootstrap+test on x86_64 and found a few functions in > the source tree (attached func_names.txt) that could be annotated with > malloc (I gave a brief look at some of the functions and didn't appear > to be false positives but I will recheck thoroughly) > > Does the patch look in the right direction ? > I would be grateful for suggestions on improving it. > > Thanks, > Prathamesh > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)