Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-27 Thread Richard Biener
On Fri, 27 Oct 2017, Jan Hubicka wrote:

> > On 25 October 2017 at 20:44, Jan Hubicka  wrote:
> > >> On 24 October 2017 at 16:26, Jan Hubicka  wrote:
> > >> >> 2017-10-13  Prathamesh Kulkarni  
> > >> >>
> > >> >>   * cgraph.h (set_malloc_flag): Declare.
> > >> >>   * cgraph.c (set_malloc_flag_1): New function.
> > >> >>   (set_malloc_flag): Likewise.
> > >> >>   * ipa-fnsummary.h (ipa_call_summary): Add new field 
> > >> >> is_return_callee.
> > >> >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set 
> > >> >> is_return_callee to
> > >> >>   false.
> > >> >>   (read_ipa_call_summary): Add support for reading 
> > >> >> is_return_callee.
> > >> >>   (write_ipa_call_summary): Stream is_return_callee.
> > >> >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
> > >> >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
> > >> >> symbol-summary.h,
> > >> >>   ipa-prop.h, ipa-fnsummary.h.
> > >> >>   (pure_const_names): Change to static.
> > >> >>   (malloc_state_e): Define.
> > >> >>   (malloc_state_names): Define.
> > >> >>   (funct_state_d): Add field malloc_state.
> > >> >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
> > >> >>   (check_retval_uses): New function.
> > >> >>   (malloc_candidate_p): Likewise.
> > >> >>   (analyze_function): Add support for malloc attribute.
> > >> >>   (pure_const_write_summary): Stream malloc_state.
> > >> >>   (pure_const_read_summary): Add support for reading malloc_state.
> > >> >>   (dump_malloc_lattice): New function.
> > >> >>   (propagate_malloc): New function.
> > >> >>   (ipa_pure_const::execute): Call propagate_malloc and
> > >> >>   ipa_free_fn_summary.
> > >> >>   (pass_local_pure_const::execute): Add support for malloc 
> > >> >> attribute.
> > >> >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
> > >> >>
> > >> >> testsuite/
> > >> >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
> > >> >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
> > >> >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
> > >> >
> > >> > OK.
> > >> > Perhaps we could also add -Wsuggest-sttribute=malloc and mention it in 
> > >> > changes.html?
> > >> Done in this version.
> > >> In warn_function_malloc(), I passed false for known_finite param to
> > >> suggest_attribute().
> > >> Does that look OK ?
> > >> Validation in progress. OK to commit if passes ?
> > >
> > > OK, thanks!
> > Thanks, committed as r254140 after following validation:
> > 1] Bootstrap+test with --enable-languages=all,ada,go on
> > x86_64-unknown-linux-gnu and ppc64le-linux-gnu.
> > 2] LTO bootstrap+test on x86_64-unknown-linux-gnu and ppc64le-linux-gnu
> > 3] Cross tested on arm*-*-* and aarch64*-*-*.
> > 
> > Would it be a good idea to extend ipa-pure-const to propagate
> > alloc_size/alloc_align and returns_nonnull attributes ?
> > Which other attributes would be useful to propagate in ipa-pure-const ?
> 
> Also one extension I was considering was TBAA mod-ref pass. I.e. propagate 
> what
> types are read/stored by the call, rather than having just pure/const (no 
> stores,
> no reads at all).
> 
> This will be bit fun to implement in IPA, but it may be useful.  If you would
> be interested in looking into this, we can discuss it (I wanted to implement 
> it
> this stage1 but I think I have way too many other plans).
> 
> LLVM also has nocapture that seems useful for PTA. Richi may have useful
> opinions on that ;)

I once tried to prototype "fn spec" attribute autodetection and
IPA propagation (also into IPA pure const).  Didn't get very far
though.  That tracks per function argument properties like whether
memory is written to through this pointer or whether a pointer
possibly escapes from the function (including through its returned
value).

Richard.


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-27 Thread Jan Hubicka
> On 25 October 2017 at 20:44, Jan Hubicka  wrote:
> >> On 24 October 2017 at 16:26, Jan Hubicka  wrote:
> >> >> 2017-10-13  Prathamesh Kulkarni  
> >> >>
> >> >>   * cgraph.h (set_malloc_flag): Declare.
> >> >>   * cgraph.c (set_malloc_flag_1): New function.
> >> >>   (set_malloc_flag): Likewise.
> >> >>   * ipa-fnsummary.h (ipa_call_summary): Add new field 
> >> >> is_return_callee.
> >> >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee 
> >> >> to
> >> >>   false.
> >> >>   (read_ipa_call_summary): Add support for reading is_return_callee.
> >> >>   (write_ipa_call_summary): Stream is_return_callee.
> >> >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
> >> >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
> >> >> symbol-summary.h,
> >> >>   ipa-prop.h, ipa-fnsummary.h.
> >> >>   (pure_const_names): Change to static.
> >> >>   (malloc_state_e): Define.
> >> >>   (malloc_state_names): Define.
> >> >>   (funct_state_d): Add field malloc_state.
> >> >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
> >> >>   (check_retval_uses): New function.
> >> >>   (malloc_candidate_p): Likewise.
> >> >>   (analyze_function): Add support for malloc attribute.
> >> >>   (pure_const_write_summary): Stream malloc_state.
> >> >>   (pure_const_read_summary): Add support for reading malloc_state.
> >> >>   (dump_malloc_lattice): New function.
> >> >>   (propagate_malloc): New function.
> >> >>   (ipa_pure_const::execute): Call propagate_malloc and
> >> >>   ipa_free_fn_summary.
> >> >>   (pass_local_pure_const::execute): Add support for malloc 
> >> >> attribute.
> >> >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
> >> >>
> >> >> testsuite/
> >> >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
> >> >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
> >> >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
> >> >
> >> > OK.
> >> > Perhaps we could also add -Wsuggest-sttribute=malloc and mention it in 
> >> > changes.html?
> >> Done in this version.
> >> In warn_function_malloc(), I passed false for known_finite param to
> >> suggest_attribute().
> >> Does that look OK ?
> >> Validation in progress. OK to commit if passes ?
> >
> > OK, thanks!
> Thanks, committed as r254140 after following validation:
> 1] Bootstrap+test with --enable-languages=all,ada,go on
> x86_64-unknown-linux-gnu and ppc64le-linux-gnu.
> 2] LTO bootstrap+test on x86_64-unknown-linux-gnu and ppc64le-linux-gnu
> 3] Cross tested on arm*-*-* and aarch64*-*-*.
> 
> Would it be a good idea to extend ipa-pure-const to propagate
> alloc_size/alloc_align and returns_nonnull attributes ?
> Which other attributes would be useful to propagate in ipa-pure-const ?

Also one extension I was considering was TBAA mod-ref pass. I.e. propagate what
types are read/stored by the call, rather than having just pure/const (no 
stores,
no reads at all).

This will be bit fun to implement in IPA, but it may be useful.  If you would
be interested in looking into this, we can discuss it (I wanted to implement it
this stage1 but I think I have way too many other plans).

LLVM also has nocapture that seems useful for PTA. Richi may have useful
opinions on that ;)

Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-27 Thread Jan Hubicka
Prathamesh
> > OK, thanks!
> Thanks, committed as r254140 after following validation:
> 1] Bootstrap+test with --enable-languages=all,ada,go on
> x86_64-unknown-linux-gnu and ppc64le-linux-gnu.
> 2] LTO bootstrap+test on x86_64-unknown-linux-gnu and ppc64le-linux-gnu
> 3] Cross tested on arm*-*-* and aarch64*-*-*.
> 
> Would it be a good idea to extend ipa-pure-const to propagate
> alloc_size/alloc_align and returns_nonnull attributes ?
> Which other attributes would be useful to propagate in ipa-pure-const ?

Good I did not discourage you by slow review rate (will try to do something
about it).
I guess alloc_size/alloc_align are obvious candidates.

returns_nonnull is a special case of VRP over returns so I would rather
like to see ipa-prop/ipa-cp extended to handle return values and this done
as a consequence.

One interesting case I think we could try to track is what types of exceptions
are thrown.  Currently if function throws specific exception which is handled
by caller, we still think caller may throw because we do not take types into
consideration at all.  I think that may mark significant portion of functions
nothrow as this seems like common coding practice.

It would be also nice to cleanup ipa-pure-const.  I think one can implement
propagation template where one just feeds the basic parameters (what data
to store, what edges to consider) rahter than copying the rather outdated
code again and again.
> 
> Also, I would be grateful for suggestions on how to propagate malloc
> attribute through indirect calls.
> Currently, I have left it as FIXME, and simply drop the lattice to
> MALLOC_BOTTOM if there's an indirect call :(
> 
> I am not able to understand how attribute propagation across indirect
> calls works.
> For example consider propagation of nothrow attribute in following test-case:
> 
> __attribute__((noinline, noclone, nothrow))
> int f1(int k) { return k; }
> 
> __attribute__((noinline, noclone))
> static int foo(int (*p)(int))
> {
>   return p(10);
> }
> 
> __attribute__((noinline, noclone))
> int bar(void)
> {
>   return foo(f1);
> }
> 
> Shouldn't foo and bar be also marked as nothrow ?
> Since foo indirectly calls f1 which is nothrow and bar only calls foo ?

Well, foo may have other uses where it calls something else than f1
so one needs to track "nothrow in the context when f1 is called".

In general I think this reduces to may edges in the callgraph (for given
indirect calls we want to track whether we know complete list of possible
targets).  We do that for polymorphic calls via ipa-polymorphic-call analysis
but I did not get around implementing anything for indirect calls.

To ge the list of targets, you call possible_polymorphic_call_targets which
also tells you whether the list is complete (final flag) or whether there may
be some callees invisible to compiler (such as derrivate types from other
compilation unit).

The lists may be large and for that reason there is cache token which tells
you if you see same list again.

Extending ipa-pure-const to work across final lists of polymorphic targets
may be first step to handle indirect calls in general.

Honza
> 
> The local-pure-const2 dump shows function is locally throwing  for
> "foo" and "bar", and ipa-pure-const dump doesn't appear to show foo and bar
> marked as nothrow.
> 
> Thanks,
> Prathamesh
> > Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-27 Thread Prathamesh Kulkarni
On 25 October 2017 at 20:44, Jan Hubicka  wrote:
>> On 24 October 2017 at 16:26, Jan Hubicka  wrote:
>> >> 2017-10-13  Prathamesh Kulkarni  
>> >>
>> >>   * cgraph.h (set_malloc_flag): Declare.
>> >>   * cgraph.c (set_malloc_flag_1): New function.
>> >>   (set_malloc_flag): Likewise.
>> >>   * ipa-fnsummary.h (ipa_call_summary): Add new field 
>> >> is_return_callee.
>> >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>> >>   false.
>> >>   (read_ipa_call_summary): Add support for reading is_return_callee.
>> >>   (write_ipa_call_summary): Stream is_return_callee.
>> >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>> >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
>> >> symbol-summary.h,
>> >>   ipa-prop.h, ipa-fnsummary.h.
>> >>   (pure_const_names): Change to static.
>> >>   (malloc_state_e): Define.
>> >>   (malloc_state_names): Define.
>> >>   (funct_state_d): Add field malloc_state.
>> >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>> >>   (check_retval_uses): New function.
>> >>   (malloc_candidate_p): Likewise.
>> >>   (analyze_function): Add support for malloc attribute.
>> >>   (pure_const_write_summary): Stream malloc_state.
>> >>   (pure_const_read_summary): Add support for reading malloc_state.
>> >>   (dump_malloc_lattice): New function.
>> >>   (propagate_malloc): New function.
>> >>   (ipa_pure_const::execute): Call propagate_malloc and
>> >>   ipa_free_fn_summary.
>> >>   (pass_local_pure_const::execute): Add support for malloc attribute.
>> >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
>> >>
>> >> testsuite/
>> >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>> >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>> >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
>> >
>> > OK.
>> > Perhaps we could also add -Wsuggest-sttribute=malloc and mention it in 
>> > changes.html?
>> Done in this version.
>> In warn_function_malloc(), I passed false for known_finite param to
>> suggest_attribute().
>> Does that look OK ?
>> Validation in progress. OK to commit if passes ?
>
> OK, thanks!
Thanks, committed as r254140 after following validation:
1] Bootstrap+test with --enable-languages=all,ada,go on
x86_64-unknown-linux-gnu and ppc64le-linux-gnu.
2] LTO bootstrap+test on x86_64-unknown-linux-gnu and ppc64le-linux-gnu
3] Cross tested on arm*-*-* and aarch64*-*-*.

Would it be a good idea to extend ipa-pure-const to propagate
alloc_size/alloc_align and returns_nonnull attributes ?
Which other attributes would be useful to propagate in ipa-pure-const ?

Also, I would be grateful for suggestions on how to propagate malloc
attribute through indirect calls.
Currently, I have left it as FIXME, and simply drop the lattice to
MALLOC_BOTTOM if there's an indirect call :(

I am not able to understand how attribute propagation across indirect
calls works.
For example consider propagation of nothrow attribute in following test-case:

__attribute__((noinline, noclone, nothrow))
int f1(int k) { return k; }

__attribute__((noinline, noclone))
static int foo(int (*p)(int))
{
  return p(10);
}

__attribute__((noinline, noclone))
int bar(void)
{
  return foo(f1);
}

Shouldn't foo and bar be also marked as nothrow ?
Since foo indirectly calls f1 which is nothrow and bar only calls foo ?

The local-pure-const2 dump shows function is locally throwing  for
"foo" and "bar", and ipa-pure-const dump doesn't appear to show foo and bar
marked as nothrow.

Thanks,
Prathamesh
> Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-25 Thread Jan Hubicka
> On 24 October 2017 at 16:26, Jan Hubicka  wrote:
> >> 2017-10-13  Prathamesh Kulkarni  
> >>
> >>   * cgraph.h (set_malloc_flag): Declare.
> >>   * cgraph.c (set_malloc_flag_1): New function.
> >>   (set_malloc_flag): Likewise.
> >>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
> >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
> >>   false.
> >>   (read_ipa_call_summary): Add support for reading is_return_callee.
> >>   (write_ipa_call_summary): Stream is_return_callee.
> >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
> >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
> >> symbol-summary.h,
> >>   ipa-prop.h, ipa-fnsummary.h.
> >>   (pure_const_names): Change to static.
> >>   (malloc_state_e): Define.
> >>   (malloc_state_names): Define.
> >>   (funct_state_d): Add field malloc_state.
> >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
> >>   (check_retval_uses): New function.
> >>   (malloc_candidate_p): Likewise.
> >>   (analyze_function): Add support for malloc attribute.
> >>   (pure_const_write_summary): Stream malloc_state.
> >>   (pure_const_read_summary): Add support for reading malloc_state.
> >>   (dump_malloc_lattice): New function.
> >>   (propagate_malloc): New function.
> >>   (ipa_pure_const::execute): Call propagate_malloc and
> >>   ipa_free_fn_summary.
> >>   (pass_local_pure_const::execute): Add support for malloc attribute.
> >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
> >>
> >> testsuite/
> >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
> >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
> >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
> >
> > OK.
> > Perhaps we could also add -Wsuggest-sttribute=malloc and mention it in 
> > changes.html?
> Done in this version.
> In warn_function_malloc(), I passed false for known_finite param to
> suggest_attribute().
> Does that look OK ?
> Validation in progress. OK to commit if passes ?

OK, thanks!
Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-25 Thread Prathamesh Kulkarni
On 24 October 2017 at 16:26, Jan Hubicka  wrote:
>> 2017-10-13  Prathamesh Kulkarni  
>>
>>   * cgraph.h (set_malloc_flag): Declare.
>>   * cgraph.c (set_malloc_flag_1): New function.
>>   (set_malloc_flag): Likewise.
>>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
>>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>>   false.
>>   (read_ipa_call_summary): Add support for reading is_return_callee.
>>   (write_ipa_call_summary): Stream is_return_callee.
>>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
>>   ipa-prop.h, ipa-fnsummary.h.
>>   (pure_const_names): Change to static.
>>   (malloc_state_e): Define.
>>   (malloc_state_names): Define.
>>   (funct_state_d): Add field malloc_state.
>>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>>   (check_retval_uses): New function.
>>   (malloc_candidate_p): Likewise.
>>   (analyze_function): Add support for malloc attribute.
>>   (pure_const_write_summary): Stream malloc_state.
>>   (pure_const_read_summary): Add support for reading malloc_state.
>>   (dump_malloc_lattice): New function.
>>   (propagate_malloc): New function.
>>   (ipa_pure_const::execute): Call propagate_malloc and
>>   ipa_free_fn_summary.
>>   (pass_local_pure_const::execute): Add support for malloc attribute.
>>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
>>
>> testsuite/
>>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
>
> OK.
> Perhaps we could also add -Wsuggest-sttribute=malloc and mention it in 
> changes.html?
Done in this version.
In warn_function_malloc(), I passed false for known_finite param to
suggest_attribute().
Does that look OK ?
Validation in progress. OK to commit if passes ?

Thanks,
Prathamesh
>
> Thanks,
> Honza
2017-10-13  Prathamesh Kulkarni  

* cgraph.h (set_malloc_flag): Declare.
* cgraph.c (set_malloc_flag_1): New function.
(set_malloc_flag): Likewise.
* ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
* ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
false.
(read_ipa_call_summary): Add support for reading is_return_callee.
(write_ipa_call_summary): Stream is_return_callee.
* ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
* ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
ipa-prop.h, ipa-fnsummary.h.
(pure_const_names): Change to static.
(malloc_state_e): Define.
(malloc_state_names): Define.
(funct_state_d): Add field malloc_state.
(varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
(check_retval_uses): New function.
(malloc_candidate_p): Likewise.
(analyze_function): Add support for malloc attribute.
(pure_const_write_summary): Stream malloc_state.
(pure_const_read_summary): Add support for reading malloc_state.
(dump_malloc_lattice): New function.
(propagate_malloc): New function.
(warn_function_malloc): New function.
(ipa_pure_const::execute): Call propagate_malloc and
ipa_free_fn_summary.
(pass_local_pure_const::execute): Add support for malloc attribute.
* ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
* doc/invoke.texi: Document Wsuggest-attribute=malloc.

testsuite/
* gcc.dg/ipa/propmalloc-1.c: New test-case.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 3d0cefbd46b..0aad90d59ea 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
   return changed;
 }
 
+/* Worker to set malloc flag.  */
+static void
+set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
+{
+  if (malloc_p && !DECL_IS_MALLOC (node->decl))
+{
+  DECL_IS_MALLOC (node->decl) = true;
+  *changed = true;
+}
+
+  ipa_ref *ref;
+  FOR_EACH_ALIAS (node, ref)
+{
+  cgraph_node *alias = dyn_cast (ref->referring);
+  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
+   set_malloc_flag_1 (alias, malloc_p, changed);
+}
+
+  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
+if (e->caller->thunk.thunk_p
+   && (!malloc_p || e->caller->get_availability () > AVAIL_INTERPOSABLE))
+  set_malloc_flag_1 (e->caller, malloc_p, changed);
+}
+
+/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any.  */
+
+bool
+cgraph_node::set_malloc_flag (bool malloc_p)
+{
+  bool changed = false;
+
+ 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-24 Thread Jan Hubicka
> 2017-10-13  Prathamesh Kulkarni  
> 
>   * cgraph.h (set_malloc_flag): Declare.
>   * cgraph.c (set_malloc_flag_1): New function.
>   (set_malloc_flag): Likewise.
>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>   false.
>   (read_ipa_call_summary): Add support for reading is_return_callee.
>   (write_ipa_call_summary): Stream is_return_callee.
>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
>   ipa-prop.h, ipa-fnsummary.h.
>   (pure_const_names): Change to static.
>   (malloc_state_e): Define.
>   (malloc_state_names): Define.
>   (funct_state_d): Add field malloc_state.
>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>   (check_retval_uses): New function.
>   (malloc_candidate_p): Likewise.
>   (analyze_function): Add support for malloc attribute.
>   (pure_const_write_summary): Stream malloc_state.
>   (pure_const_read_summary): Add support for reading malloc_state.
>   (dump_malloc_lattice): New function.
>   (propagate_malloc): New function.
>   (ipa_pure_const::execute): Call propagate_malloc and
>   ipa_free_fn_summary.
>   (pass_local_pure_const::execute): Add support for malloc attribute.
>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
> 
> testsuite/
>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>   * gcc.dg/ipa/propmalloc-3.c: Likewise.

OK.
Perhaps we could also add -Wsuggest-sttribute=malloc and mention it in 
changes.html?

Thanks,
Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-23 Thread Prathamesh Kulkarni
On 14 October 2017 at 03:20, Prathamesh Kulkarni
 wrote:
> On 7 October 2017 at 12:35, Prathamesh Kulkarni
>  wrote:
>> On 7 October 2017 at 11:23, Jan Hubicka  wrote:
 On 6 October 2017 at 06:04, Jan Hubicka  wrote:
 >> Hi Honza,
 >> Thanks for the detailed suggestions, I have updated the patch 
 >> accordingly.
 >> I have following questions on call_summary:
 >> 1] I added field bool is_return_callee in ipa_call_summary to track
 >> whether the caller possibly returns value returned by callee, which
 >> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
 >> and ipa_call_summary_t::duplicate() will already take care of handling
 >> late insertion/removal of cgraph nodes ? I just initialized
 >> is_return_callee to false in ipa_call_summary::reset and that seems to
 >> work. I am not sure though if I have handled it correctly. Could you
 >> please check that ?
 >
 > I was actually thinking to introduce separate summary for ipa-pure-const 
 > pass,
 > but this seems fine to me too (for one bit definitly more effecient)
 > ipa_call_summary_t::duplicate copies all the fields, so indeed you 
 > should be
 > safe here.
 >
 > Also it is possible for functions to be inserted late.  Updating of call 
 > summaries
 > is currently handled by ipa_fn_summary_t::insert
 >>
 >> 2] ipa_inline() called ipa_free_fn_summary, which made
 >> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
 >> call to ipa_free_fn_summary from ipa_inline, and moved it to
 >> ipa_pure_const::execute(). Is that OK ?
 >
 > Seems OK to me.
 >>
 >> Patch passes bootstrap+test and lto bootstrap+test on 
 >> x86_64-unknown-linux-gnu.
 >> Verfiied SPEC2k6 compiles and runs without miscompares with LTO
 >> enabled on aarch64-linux-gnu.
 >> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
 >> the patch by building chromium or firefox.
 >> Would it be OK to commit if it passes above validations ?
 >>
 >> Thanks,
 >> Prathamesh
 >> >
 >> > Thanks,
 >> > Honza
 >
 >> 2017-10-05  Prathamesh Kulkarni  
 >>
 >>   * cgraph.h (set_malloc_flag): Declare.
 >>   * cgraph.c (set_malloc_flag_1): New function.
 >>   (set_malloc_flag): Likewise.
 >>   * ipa-fnsummary.h (ipa_call_summary): Add new field 
 >> is_return_callee.
 >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee 
 >> to
 >>   false.
 >>   (read_ipa_call_summary): Add support for reading is_return_callee.
 >>   (write_ipa_call_summary): Stream is_return_callee.
 >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
 >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
 >> symbol-summary.h,
 >>   ipa-prop.h, ipa-fnsummary.h.
 >>   (malloc_state_e): Define.
 >>   (malloc_state_names): Define.
 >>   (funct_state_d): Add field malloc_state.
 >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
 >>   (check_retval_uses): New function.
 >>   (malloc_candidate_p): Likewise.
 >>   (analyze_function): Add support for malloc attribute.
 >>   (pure_const_write_summary): Stream malloc_state.
 >>   (pure_const_read_summary): Add support for reading malloc_state.
 >>   (dump_malloc_lattice): New function.
 >>   (propagate_malloc): New function.
 >>   (ipa_pure_const::execute): Call propagate_malloc and
 >>   ipa_free_fn_summary.
 >>   (pass_local_pure_const::execute): Add support for malloc 
 >> attribute.
 >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
 >>
 >> testsuite/
 >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
 >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
 >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
 >>
 >> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
 >> index 3d0cefbd46b..0aad90d59ea 100644
 >> --- a/gcc/cgraph.c
 >> +++ b/gcc/cgraph.c
 >> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
 >>return changed;
 >>  }
 >>
 >> +/* Worker to set malloc flag.  */
 > New line here I guess (it is below)
 >> +static void
 >> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
 >> +{
 >> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
 >> +{
 >> +  DECL_IS_MALLOC (node->decl) = true;
 >> +  *changed = true;
 >> +}
 >> +
 >> +  ipa_ref *ref;
 >> +  FOR_EACH_ALIAS (node, ref)
 >> +{
 >> +  cgraph_node *alias = dyn_cast (ref->referring);
 >> +  if (!malloc_p || 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-13 Thread Prathamesh Kulkarni
On 7 October 2017 at 12:35, Prathamesh Kulkarni
 wrote:
> On 7 October 2017 at 11:23, Jan Hubicka  wrote:
>>> On 6 October 2017 at 06:04, Jan Hubicka  wrote:
>>> >> Hi Honza,
>>> >> Thanks for the detailed suggestions, I have updated the patch 
>>> >> accordingly.
>>> >> I have following questions on call_summary:
>>> >> 1] I added field bool is_return_callee in ipa_call_summary to track
>>> >> whether the caller possibly returns value returned by callee, which
>>> >> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
>>> >> and ipa_call_summary_t::duplicate() will already take care of handling
>>> >> late insertion/removal of cgraph nodes ? I just initialized
>>> >> is_return_callee to false in ipa_call_summary::reset and that seems to
>>> >> work. I am not sure though if I have handled it correctly. Could you
>>> >> please check that ?
>>> >
>>> > I was actually thinking to introduce separate summary for ipa-pure-const 
>>> > pass,
>>> > but this seems fine to me too (for one bit definitly more effecient)
>>> > ipa_call_summary_t::duplicate copies all the fields, so indeed you should 
>>> > be
>>> > safe here.
>>> >
>>> > Also it is possible for functions to be inserted late.  Updating of call 
>>> > summaries
>>> > is currently handled by ipa_fn_summary_t::insert
>>> >>
>>> >> 2] ipa_inline() called ipa_free_fn_summary, which made
>>> >> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
>>> >> call to ipa_free_fn_summary from ipa_inline, and moved it to
>>> >> ipa_pure_const::execute(). Is that OK ?
>>> >
>>> > Seems OK to me.
>>> >>
>>> >> Patch passes bootstrap+test and lto bootstrap+test on 
>>> >> x86_64-unknown-linux-gnu.
>>> >> Verfiied SPEC2k6 compiles and runs without miscompares with LTO
>>> >> enabled on aarch64-linux-gnu.
>>> >> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
>>> >> the patch by building chromium or firefox.
>>> >> Would it be OK to commit if it passes above validations ?
>>> >>
>>> >> Thanks,
>>> >> Prathamesh
>>> >> >
>>> >> > Thanks,
>>> >> > Honza
>>> >
>>> >> 2017-10-05  Prathamesh Kulkarni  
>>> >>
>>> >>   * cgraph.h (set_malloc_flag): Declare.
>>> >>   * cgraph.c (set_malloc_flag_1): New function.
>>> >>   (set_malloc_flag): Likewise.
>>> >>   * ipa-fnsummary.h (ipa_call_summary): Add new field 
>>> >> is_return_callee.
>>> >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee 
>>> >> to
>>> >>   false.
>>> >>   (read_ipa_call_summary): Add support for reading is_return_callee.
>>> >>   (write_ipa_call_summary): Stream is_return_callee.
>>> >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>>> >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
>>> >> symbol-summary.h,
>>> >>   ipa-prop.h, ipa-fnsummary.h.
>>> >>   (malloc_state_e): Define.
>>> >>   (malloc_state_names): Define.
>>> >>   (funct_state_d): Add field malloc_state.
>>> >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>>> >>   (check_retval_uses): New function.
>>> >>   (malloc_candidate_p): Likewise.
>>> >>   (analyze_function): Add support for malloc attribute.
>>> >>   (pure_const_write_summary): Stream malloc_state.
>>> >>   (pure_const_read_summary): Add support for reading malloc_state.
>>> >>   (dump_malloc_lattice): New function.
>>> >>   (propagate_malloc): New function.
>>> >>   (ipa_pure_const::execute): Call propagate_malloc and
>>> >>   ipa_free_fn_summary.
>>> >>   (pass_local_pure_const::execute): Add support for malloc attribute.
>>> >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
>>> >>
>>> >> testsuite/
>>> >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>>> >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>>> >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
>>> >>
>>> >> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>>> >> index 3d0cefbd46b..0aad90d59ea 100644
>>> >> --- a/gcc/cgraph.c
>>> >> +++ b/gcc/cgraph.c
>>> >> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
>>> >>return changed;
>>> >>  }
>>> >>
>>> >> +/* Worker to set malloc flag.  */
>>> > New line here I guess (it is below)
>>> >> +static void
>>> >> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
>>> >> +{
>>> >> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
>>> >> +{
>>> >> +  DECL_IS_MALLOC (node->decl) = true;
>>> >> +  *changed = true;
>>> >> +}
>>> >> +
>>> >> +  ipa_ref *ref;
>>> >> +  FOR_EACH_ALIAS (node, ref)
>>> >> +{
>>> >> +  cgraph_node *alias = dyn_cast (ref->referring);
>>> >> +  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
>>> >> + set_malloc_flag_1 (alias, malloc_p, changed);
>>> >> +}
>>> >> +
>>> >> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
>>> >> +if 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-07 Thread Prathamesh Kulkarni
On 7 October 2017 at 11:23, Jan Hubicka  wrote:
>> On 6 October 2017 at 06:04, Jan Hubicka  wrote:
>> >> Hi Honza,
>> >> Thanks for the detailed suggestions, I have updated the patch accordingly.
>> >> I have following questions on call_summary:
>> >> 1] I added field bool is_return_callee in ipa_call_summary to track
>> >> whether the caller possibly returns value returned by callee, which
>> >> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
>> >> and ipa_call_summary_t::duplicate() will already take care of handling
>> >> late insertion/removal of cgraph nodes ? I just initialized
>> >> is_return_callee to false in ipa_call_summary::reset and that seems to
>> >> work. I am not sure though if I have handled it correctly. Could you
>> >> please check that ?
>> >
>> > I was actually thinking to introduce separate summary for ipa-pure-const 
>> > pass,
>> > but this seems fine to me too (for one bit definitly more effecient)
>> > ipa_call_summary_t::duplicate copies all the fields, so indeed you should 
>> > be
>> > safe here.
>> >
>> > Also it is possible for functions to be inserted late.  Updating of call 
>> > summaries
>> > is currently handled by ipa_fn_summary_t::insert
>> >>
>> >> 2] ipa_inline() called ipa_free_fn_summary, which made
>> >> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
>> >> call to ipa_free_fn_summary from ipa_inline, and moved it to
>> >> ipa_pure_const::execute(). Is that OK ?
>> >
>> > Seems OK to me.
>> >>
>> >> Patch passes bootstrap+test and lto bootstrap+test on 
>> >> x86_64-unknown-linux-gnu.
>> >> Verfiied SPEC2k6 compiles and runs without miscompares with LTO
>> >> enabled on aarch64-linux-gnu.
>> >> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
>> >> the patch by building chromium or firefox.
>> >> Would it be OK to commit if it passes above validations ?
>> >>
>> >> Thanks,
>> >> Prathamesh
>> >> >
>> >> > Thanks,
>> >> > Honza
>> >
>> >> 2017-10-05  Prathamesh Kulkarni  
>> >>
>> >>   * cgraph.h (set_malloc_flag): Declare.
>> >>   * cgraph.c (set_malloc_flag_1): New function.
>> >>   (set_malloc_flag): Likewise.
>> >>   * ipa-fnsummary.h (ipa_call_summary): Add new field 
>> >> is_return_callee.
>> >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>> >>   false.
>> >>   (read_ipa_call_summary): Add support for reading is_return_callee.
>> >>   (write_ipa_call_summary): Stream is_return_callee.
>> >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>> >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
>> >> symbol-summary.h,
>> >>   ipa-prop.h, ipa-fnsummary.h.
>> >>   (malloc_state_e): Define.
>> >>   (malloc_state_names): Define.
>> >>   (funct_state_d): Add field malloc_state.
>> >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>> >>   (check_retval_uses): New function.
>> >>   (malloc_candidate_p): Likewise.
>> >>   (analyze_function): Add support for malloc attribute.
>> >>   (pure_const_write_summary): Stream malloc_state.
>> >>   (pure_const_read_summary): Add support for reading malloc_state.
>> >>   (dump_malloc_lattice): New function.
>> >>   (propagate_malloc): New function.
>> >>   (ipa_pure_const::execute): Call propagate_malloc and
>> >>   ipa_free_fn_summary.
>> >>   (pass_local_pure_const::execute): Add support for malloc attribute.
>> >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
>> >>
>> >> testsuite/
>> >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>> >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>> >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
>> >>
>> >> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>> >> index 3d0cefbd46b..0aad90d59ea 100644
>> >> --- a/gcc/cgraph.c
>> >> +++ b/gcc/cgraph.c
>> >> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
>> >>return changed;
>> >>  }
>> >>
>> >> +/* Worker to set malloc flag.  */
>> > New line here I guess (it is below)
>> >> +static void
>> >> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
>> >> +{
>> >> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
>> >> +{
>> >> +  DECL_IS_MALLOC (node->decl) = true;
>> >> +  *changed = true;
>> >> +}
>> >> +
>> >> +  ipa_ref *ref;
>> >> +  FOR_EACH_ALIAS (node, ref)
>> >> +{
>> >> +  cgraph_node *alias = dyn_cast (ref->referring);
>> >> +  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
>> >> + set_malloc_flag_1 (alias, malloc_p, changed);
>> >> +}
>> >> +
>> >> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
>> >> +if (e->caller->thunk.thunk_p
>> >> + && (!malloc_p || e->caller->get_availability () > 
>> >> AVAIL_INTERPOSABLE))
>> >> +  set_malloc_flag_1 (e->caller, malloc_p, changed);
>> >> +}
>> >> +
>> >> +/* Set 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-07 Thread Jan Hubicka
> On 6 October 2017 at 06:04, Jan Hubicka  wrote:
> >> Hi Honza,
> >> Thanks for the detailed suggestions, I have updated the patch accordingly.
> >> I have following questions on call_summary:
> >> 1] I added field bool is_return_callee in ipa_call_summary to track
> >> whether the caller possibly returns value returned by callee, which
> >> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
> >> and ipa_call_summary_t::duplicate() will already take care of handling
> >> late insertion/removal of cgraph nodes ? I just initialized
> >> is_return_callee to false in ipa_call_summary::reset and that seems to
> >> work. I am not sure though if I have handled it correctly. Could you
> >> please check that ?
> >
> > I was actually thinking to introduce separate summary for ipa-pure-const 
> > pass,
> > but this seems fine to me too (for one bit definitly more effecient)
> > ipa_call_summary_t::duplicate copies all the fields, so indeed you should be
> > safe here.
> >
> > Also it is possible for functions to be inserted late.  Updating of call 
> > summaries
> > is currently handled by ipa_fn_summary_t::insert
> >>
> >> 2] ipa_inline() called ipa_free_fn_summary, which made
> >> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
> >> call to ipa_free_fn_summary from ipa_inline, and moved it to
> >> ipa_pure_const::execute(). Is that OK ?
> >
> > Seems OK to me.
> >>
> >> Patch passes bootstrap+test and lto bootstrap+test on 
> >> x86_64-unknown-linux-gnu.
> >> Verfiied SPEC2k6 compiles and runs without miscompares with LTO
> >> enabled on aarch64-linux-gnu.
> >> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
> >> the patch by building chromium or firefox.
> >> Would it be OK to commit if it passes above validations ?
> >>
> >> Thanks,
> >> Prathamesh
> >> >
> >> > Thanks,
> >> > Honza
> >
> >> 2017-10-05  Prathamesh Kulkarni  
> >>
> >>   * cgraph.h (set_malloc_flag): Declare.
> >>   * cgraph.c (set_malloc_flag_1): New function.
> >>   (set_malloc_flag): Likewise.
> >>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
> >>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
> >>   false.
> >>   (read_ipa_call_summary): Add support for reading is_return_callee.
> >>   (write_ipa_call_summary): Stream is_return_callee.
> >>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
> >>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, 
> >> symbol-summary.h,
> >>   ipa-prop.h, ipa-fnsummary.h.
> >>   (malloc_state_e): Define.
> >>   (malloc_state_names): Define.
> >>   (funct_state_d): Add field malloc_state.
> >>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
> >>   (check_retval_uses): New function.
> >>   (malloc_candidate_p): Likewise.
> >>   (analyze_function): Add support for malloc attribute.
> >>   (pure_const_write_summary): Stream malloc_state.
> >>   (pure_const_read_summary): Add support for reading malloc_state.
> >>   (dump_malloc_lattice): New function.
> >>   (propagate_malloc): New function.
> >>   (ipa_pure_const::execute): Call propagate_malloc and
> >>   ipa_free_fn_summary.
> >>   (pass_local_pure_const::execute): Add support for malloc attribute.
> >>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
> >>
> >> testsuite/
> >>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
> >>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
> >>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
> >>
> >> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> >> index 3d0cefbd46b..0aad90d59ea 100644
> >> --- a/gcc/cgraph.c
> >> +++ b/gcc/cgraph.c
> >> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
> >>return changed;
> >>  }
> >>
> >> +/* Worker to set malloc flag.  */
> > New line here I guess (it is below)
> >> +static void
> >> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
> >> +{
> >> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
> >> +{
> >> +  DECL_IS_MALLOC (node->decl) = true;
> >> +  *changed = true;
> >> +}
> >> +
> >> +  ipa_ref *ref;
> >> +  FOR_EACH_ALIAS (node, ref)
> >> +{
> >> +  cgraph_node *alias = dyn_cast (ref->referring);
> >> +  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
> >> + set_malloc_flag_1 (alias, malloc_p, changed);
> >> +}
> >> +
> >> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> >> +if (e->caller->thunk.thunk_p
> >> + && (!malloc_p || e->caller->get_availability () > 
> >> AVAIL_INTERPOSABLE))
> >> +  set_malloc_flag_1 (e->caller, malloc_p, changed);
> >> +}
> >> +
> >> +/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any.  */
> >> +
> >> +bool
> >> +cgraph_node::set_malloc_flag (bool malloc_p)
> >> +{
> >> +  bool changed = false;
> >> +
> >> +  if (!malloc_p 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-06 Thread Prathamesh Kulkarni
On 6 October 2017 at 06:04, Jan Hubicka  wrote:
>> Hi Honza,
>> Thanks for the detailed suggestions, I have updated the patch accordingly.
>> I have following questions on call_summary:
>> 1] I added field bool is_return_callee in ipa_call_summary to track
>> whether the caller possibly returns value returned by callee, which
>> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
>> and ipa_call_summary_t::duplicate() will already take care of handling
>> late insertion/removal of cgraph nodes ? I just initialized
>> is_return_callee to false in ipa_call_summary::reset and that seems to
>> work. I am not sure though if I have handled it correctly. Could you
>> please check that ?
>
> I was actually thinking to introduce separate summary for ipa-pure-const pass,
> but this seems fine to me too (for one bit definitly more effecient)
> ipa_call_summary_t::duplicate copies all the fields, so indeed you should be
> safe here.
>
> Also it is possible for functions to be inserted late.  Updating of call 
> summaries
> is currently handled by ipa_fn_summary_t::insert
>>
>> 2] ipa_inline() called ipa_free_fn_summary, which made
>> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
>> call to ipa_free_fn_summary from ipa_inline, and moved it to
>> ipa_pure_const::execute(). Is that OK ?
>
> Seems OK to me.
>>
>> Patch passes bootstrap+test and lto bootstrap+test on 
>> x86_64-unknown-linux-gnu.
>> Verfiied SPEC2k6 compiles and runs without miscompares with LTO
>> enabled on aarch64-linux-gnu.
>> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
>> the patch by building chromium or firefox.
>> Would it be OK to commit if it passes above validations ?
>>
>> Thanks,
>> Prathamesh
>> >
>> > Thanks,
>> > Honza
>
>> 2017-10-05  Prathamesh Kulkarni  
>>
>>   * cgraph.h (set_malloc_flag): Declare.
>>   * cgraph.c (set_malloc_flag_1): New function.
>>   (set_malloc_flag): Likewise.
>>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
>>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>>   false.
>>   (read_ipa_call_summary): Add support for reading is_return_callee.
>>   (write_ipa_call_summary): Stream is_return_callee.
>>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
>>   ipa-prop.h, ipa-fnsummary.h.
>>   (malloc_state_e): Define.
>>   (malloc_state_names): Define.
>>   (funct_state_d): Add field malloc_state.
>>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>>   (check_retval_uses): New function.
>>   (malloc_candidate_p): Likewise.
>>   (analyze_function): Add support for malloc attribute.
>>   (pure_const_write_summary): Stream malloc_state.
>>   (pure_const_read_summary): Add support for reading malloc_state.
>>   (dump_malloc_lattice): New function.
>>   (propagate_malloc): New function.
>>   (ipa_pure_const::execute): Call propagate_malloc and
>>   ipa_free_fn_summary.
>>   (pass_local_pure_const::execute): Add support for malloc attribute.
>>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
>>
>> testsuite/
>>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
>>
>> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>> index 3d0cefbd46b..0aad90d59ea 100644
>> --- a/gcc/cgraph.c
>> +++ b/gcc/cgraph.c
>> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
>>return changed;
>>  }
>>
>> +/* Worker to set malloc flag.  */
> New line here I guess (it is below)
>> +static void
>> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
>> +{
>> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
>> +{
>> +  DECL_IS_MALLOC (node->decl) = true;
>> +  *changed = true;
>> +}
>> +
>> +  ipa_ref *ref;
>> +  FOR_EACH_ALIAS (node, ref)
>> +{
>> +  cgraph_node *alias = dyn_cast (ref->referring);
>> +  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
>> + set_malloc_flag_1 (alias, malloc_p, changed);
>> +}
>> +
>> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
>> +if (e->caller->thunk.thunk_p
>> + && (!malloc_p || e->caller->get_availability () > AVAIL_INTERPOSABLE))
>> +  set_malloc_flag_1 (e->caller, malloc_p, changed);
>> +}
>> +
>> +/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any.  */
>> +
>> +bool
>> +cgraph_node::set_malloc_flag (bool malloc_p)
>> +{
>> +  bool changed = false;
>> +
>> +  if (!malloc_p || get_availability () > AVAIL_INTERPOSABLE)
>> +set_malloc_flag_1 (this, malloc_p, );
>> +  else
>> +{
>> +  ipa_ref *ref;
>> +
>> +  FOR_EACH_ALIAS (this, ref)
>> + {
>> +   cgraph_node *alias = dyn_cast (ref->referring);
>> +   

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-06 Thread Jan Hubicka
> Hi Honza,
> Thanks for the detailed suggestions, I have updated the patch accordingly.
> I have following questions on call_summary:
> 1] I added field bool is_return_callee in ipa_call_summary to track
> whether the caller possibly returns value returned by callee, which
> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
> and ipa_call_summary_t::duplicate() will already take care of handling
> late insertion/removal of cgraph nodes ? I just initialized
> is_return_callee to false in ipa_call_summary::reset and that seems to
> work. I am not sure though if I have handled it correctly. Could you
> please check that ?

I was actually thinking to introduce separate summary for ipa-pure-const pass,
but this seems fine to me too (for one bit definitly more effecient)
ipa_call_summary_t::duplicate copies all the fields, so indeed you should be
safe here.  

Also it is possible for functions to be inserted late.  Updating of call 
summaries
is currently handled by ipa_fn_summary_t::insert
> 
> 2] ipa_inline() called ipa_free_fn_summary, which made
> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
> call to ipa_free_fn_summary from ipa_inline, and moved it to
> ipa_pure_const::execute(). Is that OK ?

Seems OK to me.
> 
> Patch passes bootstrap+test and lto bootstrap+test on 
> x86_64-unknown-linux-gnu.
> Verfiied SPEC2k6 compiles and runs without miscompares with LTO
> enabled on aarch64-linux-gnu.
> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
> the patch by building chromium or firefox.
> Would it be OK to commit if it passes above validations ?
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Honza

> 2017-10-05  Prathamesh Kulkarni  
> 
>   * cgraph.h (set_malloc_flag): Declare.
>   * cgraph.c (set_malloc_flag_1): New function.
>   (set_malloc_flag): Likewise.
>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>   false.
>   (read_ipa_call_summary): Add support for reading is_return_callee.
>   (write_ipa_call_summary): Stream is_return_callee.
>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
>   ipa-prop.h, ipa-fnsummary.h.
>   (malloc_state_e): Define.
>   (malloc_state_names): Define.
>   (funct_state_d): Add field malloc_state.
>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>   (check_retval_uses): New function.
>   (malloc_candidate_p): Likewise.
>   (analyze_function): Add support for malloc attribute.
>   (pure_const_write_summary): Stream malloc_state.
>   (pure_const_read_summary): Add support for reading malloc_state.
>   (dump_malloc_lattice): New function.
>   (propagate_malloc): New function.
>   (ipa_pure_const::execute): Call propagate_malloc and
>   ipa_free_fn_summary.
>   (pass_local_pure_const::execute): Add support for malloc attribute.
>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
> 
> testsuite/
>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
> 
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index 3d0cefbd46b..0aad90d59ea 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
>return changed;
>  }
>  
> +/* Worker to set malloc flag.  */
New line here I guess (it is below)
> +static void
> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
> +{
> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
> +{
> +  DECL_IS_MALLOC (node->decl) = true;
> +  *changed = true;
> +}
> +
> +  ipa_ref *ref;
> +  FOR_EACH_ALIAS (node, ref)
> +{
> +  cgraph_node *alias = dyn_cast (ref->referring);
> +  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
> + set_malloc_flag_1 (alias, malloc_p, changed);
> +}
> +
> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> +if (e->caller->thunk.thunk_p
> + && (!malloc_p || e->caller->get_availability () > AVAIL_INTERPOSABLE))
> +  set_malloc_flag_1 (e->caller, malloc_p, changed);
> +}
> +
> +/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any.  */
> +
> +bool
> +cgraph_node::set_malloc_flag (bool malloc_p)
> +{
> +  bool changed = false;
> +
> +  if (!malloc_p || get_availability () > AVAIL_INTERPOSABLE)
> +set_malloc_flag_1 (this, malloc_p, );
> +  else
> +{
> +  ipa_ref *ref;
> +
> +  FOR_EACH_ALIAS (this, ref)
> + {
> +   cgraph_node *alias = dyn_cast (ref->referring);
> +   if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
> + set_malloc_flag_1 (alias, malloc_p, );
> + }
> +}
> +  return changed;
> +}
> +
> diff --git 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-05 Thread Prathamesh Kulkarni
On 29 September 2017 at 12:28, Jan Hubicka  wrote:
>> > I wonder what happens here when, say, ipa-icf redirect the call to 
>> > eqivaelnt
>> > function and removes the callee?  Perhaps we realy want to have set of call
>> > sites rahter than nodes stored from analysis to execution. Call sites have
>> > unique stmts and uids, so it will be possible to map them back and forth.
>> IIUC, call site is represented using cgraph_edge ?
>
> Yes, there is call_stmt pointer when gimple is read and uid in WPA mode which
> are unique.  There is call_summary class which lets you to associate info with
> a call site. While it is not most memory effecient to store one bit of 
> information
> this way, i guess it may be easiest to use it in anticipation of it becoming
> more useful in foreseeable future.  We have some bits in call edge itself that
> may be shuffled to the summary as well.
>
>> So change return_callees_map to be the mapping from node to subset of
>> it's call-sites where
>> node returns the value of one of it's callees:
>> static hash_map< cgraph_node *, vec *> *return_callees_map; ?
>
> This should work too, though storing direct pointers to edges is something we
> probably want to avoid to keep memory representation of edges in bounds.
>
> I would go with call_summary - everything else seems like bit of premature
> optimization.  If we will decide to optimize it later, we may invent a variant
> of summary datatype for that.
Hi Honza,
Thanks for the detailed suggestions, I have updated the patch accordingly.
I have following questions on call_summary:
1] I added field bool is_return_callee in ipa_call_summary to track
whether the caller possibly returns value returned by callee, which
gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
and ipa_call_summary_t::duplicate() will already take care of handling
late insertion/removal of cgraph nodes ? I just initialized
is_return_callee to false in ipa_call_summary::reset and that seems to
work. I am not sure though if I have handled it correctly. Could you
please check that ?

2] ipa_inline() called ipa_free_fn_summary, which made
ipa_call_summaries unavailable during ipa-pure-const pass. I removed
call to ipa_free_fn_summary from ipa_inline, and moved it to
ipa_pure_const::execute(). Is that OK ?

Patch passes bootstrap+test and lto bootstrap+test on x86_64-unknown-linux-gnu.
Verfiied SPEC2k6 compiles and runs without miscompares with LTO
enabled on aarch64-linux-gnu.
Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
the patch by building chromium or firefox.
Would it be OK to commit if it passes above validations ?

Thanks,
Prathamesh
>
> Thanks,
> Honza
2017-10-05  Prathamesh Kulkarni  

* cgraph.h (set_malloc_flag): Declare.
* cgraph.c (set_malloc_flag_1): New function.
(set_malloc_flag): Likewise.
* ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
* ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
false.
(read_ipa_call_summary): Add support for reading is_return_callee.
(write_ipa_call_summary): Stream is_return_callee.
* ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
* ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
ipa-prop.h, ipa-fnsummary.h.
(malloc_state_e): Define.
(malloc_state_names): Define.
(funct_state_d): Add field malloc_state.
(varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
(check_retval_uses): New function.
(malloc_candidate_p): Likewise.
(analyze_function): Add support for malloc attribute.
(pure_const_write_summary): Stream malloc_state.
(pure_const_read_summary): Add support for reading malloc_state.
(dump_malloc_lattice): New function.
(propagate_malloc): New function.
(ipa_pure_const::execute): Call propagate_malloc and
ipa_free_fn_summary.
(pass_local_pure_const::execute): Add support for malloc attribute.
* ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.

testsuite/
* gcc.dg/ipa/propmalloc-1.c: New test-case.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 3d0cefbd46b..0aad90d59ea 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
   return changed;
 }
 
+/* Worker to set malloc flag.  */
+static void
+set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
+{
+  if (malloc_p && !DECL_IS_MALLOC (node->decl))
+{
+  DECL_IS_MALLOC (node->decl) = true;
+  *changed = true;
+}
+
+  ipa_ref *ref;
+  FOR_EACH_ALIAS (node, ref)
+{
+  cgraph_node *alias = dyn_cast (ref->referring);
+  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
+   

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-09-29 Thread Jan Hubicka
> > I wonder what happens here when, say, ipa-icf redirect the call to eqivaelnt
> > function and removes the callee?  Perhaps we realy want to have set of call
> > sites rahter than nodes stored from analysis to execution. Call sites have
> > unique stmts and uids, so it will be possible to map them back and forth.
> IIUC, call site is represented using cgraph_edge ?

Yes, there is call_stmt pointer when gimple is read and uid in WPA mode which
are unique.  There is call_summary class which lets you to associate info with
a call site. While it is not most memory effecient to store one bit of 
information
this way, i guess it may be easiest to use it in anticipation of it becoming
more useful in foreseeable future.  We have some bits in call edge itself that
may be shuffled to the summary as well.

> So change return_callees_map to be the mapping from node to subset of
> it's call-sites where
> node returns the value of one of it's callees:
> static hash_map< cgraph_node *, vec *> *return_callees_map; ?

This should work too, though storing direct pointers to edges is something we
probably want to avoid to keep memory representation of edges in bounds.

I would go with call_summary - everything else seems like bit of premature
optimization.  If we will decide to optimize it later, we may invent a variant
of summary datatype for that.

Thanks,
Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-09-26 Thread Prathamesh Kulkarni
On 25 September 2017 at 17:24, Jan Hubicka  wrote:
>> Hi Honza,
>> Could you please have a look at this patch ?
>> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html
>
> I can and I should have done long time ago. I really apologize for slow 
> response
> and I will try to be more timely from now on. The reason was that I had some
> patches that I was thinking I would like to push out first, but I guess since
> they are still not ready it is better to go other way around.
No worries, and thanks for the feedback!
>
> +/* A map from node to subset of callees. The subset contains those callees
> + * whose return-value is returned by the node. */
> +static hash_map< cgraph_node *, vec* > *return_callees_map;
>
> Extra * at the beggining of line.  It would make more sense to put those
> and the other bits into function_summary rather than using the hooks
> but that is something we co do incrementally.
>
> I wonder what happens here when, say, ipa-icf redirect the call to eqivaelnt
> function and removes the callee?  Perhaps we realy want to have set of call
> sites rahter than nodes stored from analysis to execution. Call sites have
> unique stmts and uids, so it will be possible to map them back and forth.
IIUC, call site is represented using cgraph_edge ?
So change return_callees_map to be the mapping from node to subset of
it's call-sites where
node returns the value of one of it's callees:
static hash_map< cgraph_node *, vec *> *return_callees_map; ?
>
> +static bool
> +check_retval_uses (tree retval, gimple *stmt)
> +{
>
> there is missing toplevel comment on those.
>
> +/*
> + * Currently this function does a very conservative analysis to check if
> + * function could be a malloc candidate.
> + *
> + * The function is considered to be a candidate if
> + * 1) The function returns a value of pointer type.
> + * 2) SSA_NAME_DEF_STMT (return_value) is either a function call or
> + *a phi, and element of phi is either NULL or
> + *SSA_NAME_DEF_STMT(element) is function call.
> + * 3) The return-value has immediate uses only within comparisons (gcond or 
> gassign)
> + *and return_stmt (and likewise a phi arg has immediate use only within 
> comparison
> + *or the phi stmt).
> + */
>
> Now * in begginig of lines. Theoretically by coding standards the comment
> should start with description of what function does and what are the 
> parameters.
> I believe Richi already commented on this part - which is more of his domain,
> but it seems fine to me.
>
> Pehraps with -details dump it would be nice to dump reason why the malloc
> candidate was rejected.
>
> +DEBUG_FUNCTION
> +static void
> +dump_malloc_lattice (FILE *dump_file, const char *s)
>
> +static void
> +propagate_malloc (void)
>
> For coding standards, please add block comments.
Thanks for the suggestions, I will try to address them in the next
version of the patch.

Regards,
Prathamesh
>
> With these changes the patch looks good to me!
> Honza
>
>>
>> I tested it with SPEC2006 on AArch64 Cortex-a57 processor and saw some
>> improvement for
>> 433.milc (+1.79%), 437.leslie3d (+2.84%) and 470.lbm (+4%) and not
>> much differences for other benchmarks.
>> I don't expect them to be precise though, it was run with only one
>> iteration of SPEC.
>> Thanks!
>>
>> Regards,
>> Prathamesh
>> >
>> > Thanks,
>> > Prathamesh
>> >>
>> >> Thanks,
>> >> Prathamesh
>> >>>
>> >>> Thanks,
>> >>> Prathamesh
>> 
>>  Thanks,
>>  Prathamesh
>> >
>> > Regards,
>> > Prathamesh
>> >>
>> >> Thanks,
>> >> Prathamesh
>> >>>
>> >>> Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-09-25 Thread Jan Hubicka
> Hi Honza,
> Could you please have a look at this patch ?
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html

I can and I should have done long time ago. I really apologize for slow response
and I will try to be more timely from now on. The reason was that I had some
patches that I was thinking I would like to push out first, but I guess since
they are still not ready it is better to go other way around.

+/* A map from node to subset of callees. The subset contains those callees
+ * whose return-value is returned by the node. */
+static hash_map< cgraph_node *, vec* > *return_callees_map;

Extra * at the beggining of line.  It would make more sense to put those
and the other bits into function_summary rather than using the hooks
but that is something we co do incrementally.

I wonder what happens here when, say, ipa-icf redirect the call to eqivaelnt
function and removes the callee?  Perhaps we realy want to have set of call
sites rahter than nodes stored from analysis to execution. Call sites have
unique stmts and uids, so it will be possible to map them back and forth.

+static bool
+check_retval_uses (tree retval, gimple *stmt)
+{

there is missing toplevel comment on those.

+/*
+ * Currently this function does a very conservative analysis to check if
+ * function could be a malloc candidate.
+ *
+ * The function is considered to be a candidate if
+ * 1) The function returns a value of pointer type.
+ * 2) SSA_NAME_DEF_STMT (return_value) is either a function call or
+ *a phi, and element of phi is either NULL or
+ *SSA_NAME_DEF_STMT(element) is function call.
+ * 3) The return-value has immediate uses only within comparisons (gcond or 
gassign)
+ *and return_stmt (and likewise a phi arg has immediate use only within 
comparison
+ *or the phi stmt).
+ */

Now * in begginig of lines. Theoretically by coding standards the comment
should start with description of what function does and what are the parameters.
I believe Richi already commented on this part - which is more of his domain,
but it seems fine to me.

Pehraps with -details dump it would be nice to dump reason why the malloc
candidate was rejected.

+DEBUG_FUNCTION
+static void
+dump_malloc_lattice (FILE *dump_file, const char *s)

+static void
+propagate_malloc (void)

For coding standards, please add block comments.

With these changes the patch looks good to me!
Honza

> 
> I tested it with SPEC2006 on AArch64 Cortex-a57 processor and saw some
> improvement for
> 433.milc (+1.79%), 437.leslie3d (+2.84%) and 470.lbm (+4%) and not
> much differences for other benchmarks.
> I don't expect them to be precise though, it was run with only one
> iteration of SPEC.
> Thanks!
> 
> Regards,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> >>
> >> Thanks,
> >> Prathamesh
> >>>
> >>> Thanks,
> >>> Prathamesh
> 
>  Thanks,
>  Prathamesh
> >
> > Regards,
> > Prathamesh
> >>
> >> Thanks,
> >> Prathamesh
> >>>
> >>> Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-09-25 Thread Prathamesh Kulkarni
On 15 September 2017 at 17:49, Prathamesh Kulkarni
 wrote:
> On 1 September 2017 at 08:09, Prathamesh Kulkarni
>  wrote:
>> On 17 August 2017 at 18:02, Prathamesh Kulkarni
>>  wrote:
>>> On 8 August 2017 at 09:50, Prathamesh Kulkarni
>>>  wrote:
 On 31 July 2017 at 23:53, Prathamesh Kulkarni
  wrote:
> On 23 May 2017 at 19:10, Prathamesh Kulkarni
>  wrote:
>> On 19 May 2017 at 19:02, Jan Hubicka  wrote:

 * LTO and memory management
 This is a general question about LTO and memory management.
 IIUC the following sequence takes place during normal LTO:
 LGEN: generate_summary, write_summary
 WPA: read_summary, execute ipa passes, write_opt_summary

 So I assumed it was OK in LGEN to allocate return_callees_map in
 generate_summary and free it in write_summary and during WPA, allocate
 return_callees_map in read_summary and free it after execute (since
 write_opt_summary does not require return_callees_map).

 However with fat LTO, it seems the sequence changes for LGEN with
 execute phase takes place after write_summary. However since
 return_callees_map is freed in pure_const_write_summary and
 propagate_malloc() accesses it in execute stage, it results in
 segmentation fault.

 To work around this, I am using the following hack in 
 pure_const_write_summary:
 // FIXME: Do not free if -ffat-lto-objects is enabled.
 if (!global_options.x_flag_fat_lto_objects)
   free_return_callees_map ();
 Is there a better approach for handling this ?
>>>
>>> I think most passes just do not free summaries with -flto.  We probably 
>>> want
>>> to fix it to make it possible to compile multiple units i.e. from 
>>> plugin by
>>> adding release_summaries method...
>>> So I would say it is OK to do the same as others do and leak it with 
>>> -flto.
 diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
 index e457166ea39..724c26e03f6 100644
 --- a/gcc/ipa-pure-const.c
 +++ b/gcc/ipa-pure-const.c
 @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "tree-scalar-evolution.h"
  #include "intl.h"
  #include "opts.h"
 +#include "ssa.h"

  /* Lattice values for const and pure functions.  Everything starts out
 being const, then may drop to pure and then neither depending on
 @@ -69,6 +70,15 @@ enum pure_const_state_e

  const char *pure_const_names[3] = {"const", "pure", "neither"};

 +enum malloc_state_e
 +{
 +  PURE_CONST_MALLOC_TOP,
 +  PURE_CONST_MALLOC,
 +  PURE_CONST_MALLOC_BOTTOM
 +};
>>>
>>> It took me a while to work out what PURE_CONST means here :)
>>> I would just call it something like STATE_MALLOC_TOP... or so.
>>> ipa_pure_const is outdated name from the time pass was doing only
>>> those two.
 @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;

  static vec funct_state_vec;

 +/* A map from node to subset of callees. The subset contains those 
 callees
 + * whose return-value is returned by the node. */
 +static hash_map< cgraph_node *, vec* > 
 *return_callees_map;
 +
>>>
>>> Hehe, a special case of return jump function.  We ought to support 
>>> those more generally.
>>> How do you keep it up to date over callgraph changes?
 @@ -921,6 +1055,23 @@ end:
if (TREE_NOTHROW (decl))
  l->can_throw = false;

 +  if (ipa)
 +{
 +  vec v = vNULL;
 +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
 +  if (DECL_IS_MALLOC (decl))
 + l->malloc_state = PURE_CONST_MALLOC;
 +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
 + {
 +   l->malloc_state = PURE_CONST_MALLOC_TOP;
 +   vec *callees_p = new vec (vNULL);
 +   for (unsigned i = 0; i < v.length (); ++i)
 + callees_p->safe_push (v[i]);
 +   return_callees_map->put (fn, callees_p);
 + }
 +  v.release ();
 +}
 +
>>>
>>> I would do non-ipa variant, too.  I think most attributes can be 
>>> detected that way
>>> as well.
>>>
>>> The patch generally makes sense to me.  It would be nice to make it 
>>> easier to write such
>>> a basic propagators across callgraph (perhaps adding a template doing 
>>> the basic
>>> 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-09-15 Thread Prathamesh Kulkarni
On 1 September 2017 at 08:09, Prathamesh Kulkarni
 wrote:
> On 17 August 2017 at 18:02, Prathamesh Kulkarni
>  wrote:
>> On 8 August 2017 at 09:50, Prathamesh Kulkarni
>>  wrote:
>>> On 31 July 2017 at 23:53, Prathamesh Kulkarni
>>>  wrote:
 On 23 May 2017 at 19:10, Prathamesh Kulkarni
  wrote:
> On 19 May 2017 at 19:02, Jan Hubicka  wrote:
>>>
>>> * LTO and memory management
>>> This is a general question about LTO and memory management.
>>> IIUC the following sequence takes place during normal LTO:
>>> LGEN: generate_summary, write_summary
>>> WPA: read_summary, execute ipa passes, write_opt_summary
>>>
>>> So I assumed it was OK in LGEN to allocate return_callees_map in
>>> generate_summary and free it in write_summary and during WPA, allocate
>>> return_callees_map in read_summary and free it after execute (since
>>> write_opt_summary does not require return_callees_map).
>>>
>>> However with fat LTO, it seems the sequence changes for LGEN with
>>> execute phase takes place after write_summary. However since
>>> return_callees_map is freed in pure_const_write_summary and
>>> propagate_malloc() accesses it in execute stage, it results in
>>> segmentation fault.
>>>
>>> To work around this, I am using the following hack in 
>>> pure_const_write_summary:
>>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>>> if (!global_options.x_flag_fat_lto_objects)
>>>   free_return_callees_map ();
>>> Is there a better approach for handling this ?
>>
>> I think most passes just do not free summaries with -flto.  We probably 
>> want
>> to fix it to make it possible to compile multiple units i.e. from plugin 
>> by
>> adding release_summaries method...
>> So I would say it is OK to do the same as others do and leak it with 
>> -flto.
>>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>>> index e457166ea39..724c26e03f6 100644
>>> --- a/gcc/ipa-pure-const.c
>>> +++ b/gcc/ipa-pure-const.c
>>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "tree-scalar-evolution.h"
>>>  #include "intl.h"
>>>  #include "opts.h"
>>> +#include "ssa.h"
>>>
>>>  /* Lattice values for const and pure functions.  Everything starts out
>>> being const, then may drop to pure and then neither depending on
>>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>>
>>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>>
>>> +enum malloc_state_e
>>> +{
>>> +  PURE_CONST_MALLOC_TOP,
>>> +  PURE_CONST_MALLOC,
>>> +  PURE_CONST_MALLOC_BOTTOM
>>> +};
>>
>> It took me a while to work out what PURE_CONST means here :)
>> I would just call it something like STATE_MALLOC_TOP... or so.
>> ipa_pure_const is outdated name from the time pass was doing only
>> those two.
>>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>>
>>>  static vec funct_state_vec;
>>>
>>> +/* A map from node to subset of callees. The subset contains those 
>>> callees
>>> + * whose return-value is returned by the node. */
>>> +static hash_map< cgraph_node *, vec* > 
>>> *return_callees_map;
>>> +
>>
>> Hehe, a special case of return jump function.  We ought to support those 
>> more generally.
>> How do you keep it up to date over callgraph changes?
>>> @@ -921,6 +1055,23 @@ end:
>>>if (TREE_NOTHROW (decl))
>>>  l->can_throw = false;
>>>
>>> +  if (ipa)
>>> +{
>>> +  vec v = vNULL;
>>> +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>>> +  if (DECL_IS_MALLOC (decl))
>>> + l->malloc_state = PURE_CONST_MALLOC;
>>> +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>>> + {
>>> +   l->malloc_state = PURE_CONST_MALLOC_TOP;
>>> +   vec *callees_p = new vec (vNULL);
>>> +   for (unsigned i = 0; i < v.length (); ++i)
>>> + callees_p->safe_push (v[i]);
>>> +   return_callees_map->put (fn, callees_p);
>>> + }
>>> +  v.release ();
>>> +}
>>> +
>>
>> I would do non-ipa variant, too.  I think most attributes can be 
>> detected that way
>> as well.
>>
>> The patch generally makes sense to me.  It would be nice to make it 
>> easier to write such
>> a basic propagators across callgraph (perhaps adding a template doing 
>> the basic
>> propagation logic). Also I think you need to solve the problem with 
>> keeping your
>> summaries up to date across callgraph node removal and duplications.
> Thanks for the suggestions, I will try to 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-08-31 Thread Prathamesh Kulkarni
On 17 August 2017 at 18:02, Prathamesh Kulkarni
 wrote:
> On 8 August 2017 at 09:50, Prathamesh Kulkarni
>  wrote:
>> On 31 July 2017 at 23:53, Prathamesh Kulkarni
>>  wrote:
>>> On 23 May 2017 at 19:10, Prathamesh Kulkarni
>>>  wrote:
 On 19 May 2017 at 19:02, Jan Hubicka  wrote:
>>
>> * LTO and memory management
>> This is a general question about LTO and memory management.
>> IIUC the following sequence takes place during normal LTO:
>> LGEN: generate_summary, write_summary
>> WPA: read_summary, execute ipa passes, write_opt_summary
>>
>> So I assumed it was OK in LGEN to allocate return_callees_map in
>> generate_summary and free it in write_summary and during WPA, allocate
>> return_callees_map in read_summary and free it after execute (since
>> write_opt_summary does not require return_callees_map).
>>
>> However with fat LTO, it seems the sequence changes for LGEN with
>> execute phase takes place after write_summary. However since
>> return_callees_map is freed in pure_const_write_summary and
>> propagate_malloc() accesses it in execute stage, it results in
>> segmentation fault.
>>
>> To work around this, I am using the following hack in 
>> pure_const_write_summary:
>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>> if (!global_options.x_flag_fat_lto_objects)
>>   free_return_callees_map ();
>> Is there a better approach for handling this ?
>
> I think most passes just do not free summaries with -flto.  We probably 
> want
> to fix it to make it possible to compile multiple units i.e. from plugin 
> by
> adding release_summaries method...
> So I would say it is OK to do the same as others do and leak it with 
> -flto.
>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>> index e457166ea39..724c26e03f6 100644
>> --- a/gcc/ipa-pure-const.c
>> +++ b/gcc/ipa-pure-const.c
>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "tree-scalar-evolution.h"
>>  #include "intl.h"
>>  #include "opts.h"
>> +#include "ssa.h"
>>
>>  /* Lattice values for const and pure functions.  Everything starts out
>> being const, then may drop to pure and then neither depending on
>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>
>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>
>> +enum malloc_state_e
>> +{
>> +  PURE_CONST_MALLOC_TOP,
>> +  PURE_CONST_MALLOC,
>> +  PURE_CONST_MALLOC_BOTTOM
>> +};
>
> It took me a while to work out what PURE_CONST means here :)
> I would just call it something like STATE_MALLOC_TOP... or so.
> ipa_pure_const is outdated name from the time pass was doing only
> those two.
>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>
>>  static vec funct_state_vec;
>>
>> +/* A map from node to subset of callees. The subset contains those 
>> callees
>> + * whose return-value is returned by the node. */
>> +static hash_map< cgraph_node *, vec* > 
>> *return_callees_map;
>> +
>
> Hehe, a special case of return jump function.  We ought to support those 
> more generally.
> How do you keep it up to date over callgraph changes?
>> @@ -921,6 +1055,23 @@ end:
>>if (TREE_NOTHROW (decl))
>>  l->can_throw = false;
>>
>> +  if (ipa)
>> +{
>> +  vec v = vNULL;
>> +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>> +  if (DECL_IS_MALLOC (decl))
>> + l->malloc_state = PURE_CONST_MALLOC;
>> +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>> + {
>> +   l->malloc_state = PURE_CONST_MALLOC_TOP;
>> +   vec *callees_p = new vec (vNULL);
>> +   for (unsigned i = 0; i < v.length (); ++i)
>> + callees_p->safe_push (v[i]);
>> +   return_callees_map->put (fn, callees_p);
>> + }
>> +  v.release ();
>> +}
>> +
>
> I would do non-ipa variant, too.  I think most attributes can be detected 
> that way
> as well.
>
> The patch generally makes sense to me.  It would be nice to make it 
> easier to write such
> a basic propagators across callgraph (perhaps adding a template doing the 
> basic
> propagation logic). Also I think you need to solve the problem with 
> keeping your
> summaries up to date across callgraph node removal and duplications.
 Thanks for the suggestions, I will try to address them in a follow-up 
 patch.
 IIUC, I would need to modify ipa-pure-const cgraph hooks -
 add_new_function, remove_node_data, duplicate_node_data
 to keep return_callees_map 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-08-17 Thread Prathamesh Kulkarni
On 8 August 2017 at 09:50, Prathamesh Kulkarni
 wrote:
> On 31 July 2017 at 23:53, Prathamesh Kulkarni
>  wrote:
>> On 23 May 2017 at 19:10, Prathamesh Kulkarni
>>  wrote:
>>> On 19 May 2017 at 19:02, Jan Hubicka  wrote:
>
> * LTO and memory management
> This is a general question about LTO and memory management.
> IIUC the following sequence takes place during normal LTO:
> LGEN: generate_summary, write_summary
> WPA: read_summary, execute ipa passes, write_opt_summary
>
> So I assumed it was OK in LGEN to allocate return_callees_map in
> generate_summary and free it in write_summary and during WPA, allocate
> return_callees_map in read_summary and free it after execute (since
> write_opt_summary does not require return_callees_map).
>
> However with fat LTO, it seems the sequence changes for LGEN with
> execute phase takes place after write_summary. However since
> return_callees_map is freed in pure_const_write_summary and
> propagate_malloc() accesses it in execute stage, it results in
> segmentation fault.
>
> To work around this, I am using the following hack in 
> pure_const_write_summary:
> // FIXME: Do not free if -ffat-lto-objects is enabled.
> if (!global_options.x_flag_fat_lto_objects)
>   free_return_callees_map ();
> Is there a better approach for handling this ?

 I think most passes just do not free summaries with -flto.  We probably 
 want
 to fix it to make it possible to compile multiple units i.e. from plugin by
 adding release_summaries method...
 So I would say it is OK to do the same as others do and leak it with -flto.
> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
> index e457166ea39..724c26e03f6 100644
> --- a/gcc/ipa-pure-const.c
> +++ b/gcc/ipa-pure-const.c
> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-scalar-evolution.h"
>  #include "intl.h"
>  #include "opts.h"
> +#include "ssa.h"
>
>  /* Lattice values for const and pure functions.  Everything starts out
> being const, then may drop to pure and then neither depending on
> @@ -69,6 +70,15 @@ enum pure_const_state_e
>
>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>
> +enum malloc_state_e
> +{
> +  PURE_CONST_MALLOC_TOP,
> +  PURE_CONST_MALLOC,
> +  PURE_CONST_MALLOC_BOTTOM
> +};

 It took me a while to work out what PURE_CONST means here :)
 I would just call it something like STATE_MALLOC_TOP... or so.
 ipa_pure_const is outdated name from the time pass was doing only
 those two.
> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>
>  static vec funct_state_vec;
>
> +/* A map from node to subset of callees. The subset contains those 
> callees
> + * whose return-value is returned by the node. */
> +static hash_map< cgraph_node *, vec* > 
> *return_callees_map;
> +

 Hehe, a special case of return jump function.  We ought to support those 
 more generally.
 How do you keep it up to date over callgraph changes?
> @@ -921,6 +1055,23 @@ end:
>if (TREE_NOTHROW (decl))
>  l->can_throw = false;
>
> +  if (ipa)
> +{
> +  vec v = vNULL;
> +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
> +  if (DECL_IS_MALLOC (decl))
> + l->malloc_state = PURE_CONST_MALLOC;
> +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
> + {
> +   l->malloc_state = PURE_CONST_MALLOC_TOP;
> +   vec *callees_p = new vec (vNULL);
> +   for (unsigned i = 0; i < v.length (); ++i)
> + callees_p->safe_push (v[i]);
> +   return_callees_map->put (fn, callees_p);
> + }
> +  v.release ();
> +}
> +

 I would do non-ipa variant, too.  I think most attributes can be detected 
 that way
 as well.

 The patch generally makes sense to me.  It would be nice to make it easier 
 to write such
 a basic propagators across callgraph (perhaps adding a template doing the 
 basic
 propagation logic). Also I think you need to solve the problem with 
 keeping your
 summaries up to date across callgraph node removal and duplications.
>>> Thanks for the suggestions, I will try to address them in a follow-up patch.
>>> IIUC, I would need to modify ipa-pure-const cgraph hooks -
>>> add_new_function, remove_node_data, duplicate_node_data
>>> to keep return_callees_map up-to-date across callgraph node insertions
>>> and removal ?
>>>
>>> Also, if instead of having a separate data-structure like 
>>> return_callees_map,
>>> should we rather have a flag within cgraph_edge, which marks that the
>>> 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-08-07 Thread Prathamesh Kulkarni
On 31 July 2017 at 23:53, Prathamesh Kulkarni
 wrote:
> On 23 May 2017 at 19:10, Prathamesh Kulkarni
>  wrote:
>> On 19 May 2017 at 19:02, Jan Hubicka  wrote:

 * LTO and memory management
 This is a general question about LTO and memory management.
 IIUC the following sequence takes place during normal LTO:
 LGEN: generate_summary, write_summary
 WPA: read_summary, execute ipa passes, write_opt_summary

 So I assumed it was OK in LGEN to allocate return_callees_map in
 generate_summary and free it in write_summary and during WPA, allocate
 return_callees_map in read_summary and free it after execute (since
 write_opt_summary does not require return_callees_map).

 However with fat LTO, it seems the sequence changes for LGEN with
 execute phase takes place after write_summary. However since
 return_callees_map is freed in pure_const_write_summary and
 propagate_malloc() accesses it in execute stage, it results in
 segmentation fault.

 To work around this, I am using the following hack in 
 pure_const_write_summary:
 // FIXME: Do not free if -ffat-lto-objects is enabled.
 if (!global_options.x_flag_fat_lto_objects)
   free_return_callees_map ();
 Is there a better approach for handling this ?
>>>
>>> I think most passes just do not free summaries with -flto.  We probably want
>>> to fix it to make it possible to compile multiple units i.e. from plugin by
>>> adding release_summaries method...
>>> So I would say it is OK to do the same as others do and leak it with -flto.
 diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
 index e457166ea39..724c26e03f6 100644
 --- a/gcc/ipa-pure-const.c
 +++ b/gcc/ipa-pure-const.c
 @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "tree-scalar-evolution.h"
  #include "intl.h"
  #include "opts.h"
 +#include "ssa.h"

  /* Lattice values for const and pure functions.  Everything starts out
 being const, then may drop to pure and then neither depending on
 @@ -69,6 +70,15 @@ enum pure_const_state_e

  const char *pure_const_names[3] = {"const", "pure", "neither"};

 +enum malloc_state_e
 +{
 +  PURE_CONST_MALLOC_TOP,
 +  PURE_CONST_MALLOC,
 +  PURE_CONST_MALLOC_BOTTOM
 +};
>>>
>>> It took me a while to work out what PURE_CONST means here :)
>>> I would just call it something like STATE_MALLOC_TOP... or so.
>>> ipa_pure_const is outdated name from the time pass was doing only
>>> those two.
 @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;

  static vec funct_state_vec;

 +/* A map from node to subset of callees. The subset contains those callees
 + * whose return-value is returned by the node. */
 +static hash_map< cgraph_node *, vec* > *return_callees_map;
 +
>>>
>>> Hehe, a special case of return jump function.  We ought to support those 
>>> more generally.
>>> How do you keep it up to date over callgraph changes?
 @@ -921,6 +1055,23 @@ end:
if (TREE_NOTHROW (decl))
  l->can_throw = false;

 +  if (ipa)
 +{
 +  vec v = vNULL;
 +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
 +  if (DECL_IS_MALLOC (decl))
 + l->malloc_state = PURE_CONST_MALLOC;
 +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
 + {
 +   l->malloc_state = PURE_CONST_MALLOC_TOP;
 +   vec *callees_p = new vec (vNULL);
 +   for (unsigned i = 0; i < v.length (); ++i)
 + callees_p->safe_push (v[i]);
 +   return_callees_map->put (fn, callees_p);
 + }
 +  v.release ();
 +}
 +
>>>
>>> I would do non-ipa variant, too.  I think most attributes can be detected 
>>> that way
>>> as well.
>>>
>>> The patch generally makes sense to me.  It would be nice to make it easier 
>>> to write such
>>> a basic propagators across callgraph (perhaps adding a template doing the 
>>> basic
>>> propagation logic). Also I think you need to solve the problem with keeping 
>>> your
>>> summaries up to date across callgraph node removal and duplications.
>> Thanks for the suggestions, I will try to address them in a follow-up patch.
>> IIUC, I would need to modify ipa-pure-const cgraph hooks -
>> add_new_function, remove_node_data, duplicate_node_data
>> to keep return_callees_map up-to-date across callgraph node insertions
>> and removal ?
>>
>> Also, if instead of having a separate data-structure like return_callees_map,
>> should we rather have a flag within cgraph_edge, which marks that the
>> caller may return the value of the callee ?
> Hi,
> Sorry for the very late response. I have attached an updated version
> of the prototype patch,
> which adds a non-ipa variant, and keeps return_callees_map up-to-date
> across 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-07-31 Thread Prathamesh Kulkarni
On 23 May 2017 at 19:10, Prathamesh Kulkarni
 wrote:
> On 19 May 2017 at 19:02, Jan Hubicka  wrote:
>>>
>>> * LTO and memory management
>>> This is a general question about LTO and memory management.
>>> IIUC the following sequence takes place during normal LTO:
>>> LGEN: generate_summary, write_summary
>>> WPA: read_summary, execute ipa passes, write_opt_summary
>>>
>>> So I assumed it was OK in LGEN to allocate return_callees_map in
>>> generate_summary and free it in write_summary and during WPA, allocate
>>> return_callees_map in read_summary and free it after execute (since
>>> write_opt_summary does not require return_callees_map).
>>>
>>> However with fat LTO, it seems the sequence changes for LGEN with
>>> execute phase takes place after write_summary. However since
>>> return_callees_map is freed in pure_const_write_summary and
>>> propagate_malloc() accesses it in execute stage, it results in
>>> segmentation fault.
>>>
>>> To work around this, I am using the following hack in 
>>> pure_const_write_summary:
>>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>>> if (!global_options.x_flag_fat_lto_objects)
>>>   free_return_callees_map ();
>>> Is there a better approach for handling this ?
>>
>> I think most passes just do not free summaries with -flto.  We probably want
>> to fix it to make it possible to compile multiple units i.e. from plugin by
>> adding release_summaries method...
>> So I would say it is OK to do the same as others do and leak it with -flto.
>>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>>> index e457166ea39..724c26e03f6 100644
>>> --- a/gcc/ipa-pure-const.c
>>> +++ b/gcc/ipa-pure-const.c
>>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "tree-scalar-evolution.h"
>>>  #include "intl.h"
>>>  #include "opts.h"
>>> +#include "ssa.h"
>>>
>>>  /* Lattice values for const and pure functions.  Everything starts out
>>> being const, then may drop to pure and then neither depending on
>>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>>
>>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>>
>>> +enum malloc_state_e
>>> +{
>>> +  PURE_CONST_MALLOC_TOP,
>>> +  PURE_CONST_MALLOC,
>>> +  PURE_CONST_MALLOC_BOTTOM
>>> +};
>>
>> It took me a while to work out what PURE_CONST means here :)
>> I would just call it something like STATE_MALLOC_TOP... or so.
>> ipa_pure_const is outdated name from the time pass was doing only
>> those two.
>>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>>
>>>  static vec funct_state_vec;
>>>
>>> +/* A map from node to subset of callees. The subset contains those callees
>>> + * whose return-value is returned by the node. */
>>> +static hash_map< cgraph_node *, vec* > *return_callees_map;
>>> +
>>
>> Hehe, a special case of return jump function.  We ought to support those 
>> more generally.
>> How do you keep it up to date over callgraph changes?
>>> @@ -921,6 +1055,23 @@ end:
>>>if (TREE_NOTHROW (decl))
>>>  l->can_throw = false;
>>>
>>> +  if (ipa)
>>> +{
>>> +  vec v = vNULL;
>>> +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>>> +  if (DECL_IS_MALLOC (decl))
>>> + l->malloc_state = PURE_CONST_MALLOC;
>>> +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>>> + {
>>> +   l->malloc_state = PURE_CONST_MALLOC_TOP;
>>> +   vec *callees_p = new vec (vNULL);
>>> +   for (unsigned i = 0; i < v.length (); ++i)
>>> + callees_p->safe_push (v[i]);
>>> +   return_callees_map->put (fn, callees_p);
>>> + }
>>> +  v.release ();
>>> +}
>>> +
>>
>> I would do non-ipa variant, too.  I think most attributes can be detected 
>> that way
>> as well.
>>
>> The patch generally makes sense to me.  It would be nice to make it easier 
>> to write such
>> a basic propagators across callgraph (perhaps adding a template doing the 
>> basic
>> propagation logic). Also I think you need to solve the problem with keeping 
>> your
>> summaries up to date across callgraph node removal and duplications.
> Thanks for the suggestions, I will try to address them in a follow-up patch.
> IIUC, I would need to modify ipa-pure-const cgraph hooks -
> add_new_function, remove_node_data, duplicate_node_data
> to keep return_callees_map up-to-date across callgraph node insertions
> and removal ?
>
> Also, if instead of having a separate data-structure like return_callees_map,
> should we rather have a flag within cgraph_edge, which marks that the
> caller may return the value of the callee ?
Hi,
Sorry for the very late response. I have attached an updated version
of the prototype patch,
which adds a non-ipa variant, and keeps return_callees_map up-to-date
across callgraph
node insertions and removal. For the non-ipa variant,
malloc_candidate_p() additionally checks
that all the "return callees" have DECL_IS_MALLOC set to true.
Bootstrapped+tested and LTO bootstrapped+tested 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-23 Thread Prathamesh Kulkarni
On 19 May 2017 at 19:02, Jan Hubicka  wrote:
>>
>> * LTO and memory management
>> This is a general question about LTO and memory management.
>> IIUC the following sequence takes place during normal LTO:
>> LGEN: generate_summary, write_summary
>> WPA: read_summary, execute ipa passes, write_opt_summary
>>
>> So I assumed it was OK in LGEN to allocate return_callees_map in
>> generate_summary and free it in write_summary and during WPA, allocate
>> return_callees_map in read_summary and free it after execute (since
>> write_opt_summary does not require return_callees_map).
>>
>> However with fat LTO, it seems the sequence changes for LGEN with
>> execute phase takes place after write_summary. However since
>> return_callees_map is freed in pure_const_write_summary and
>> propagate_malloc() accesses it in execute stage, it results in
>> segmentation fault.
>>
>> To work around this, I am using the following hack in 
>> pure_const_write_summary:
>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>> if (!global_options.x_flag_fat_lto_objects)
>>   free_return_callees_map ();
>> Is there a better approach for handling this ?
>
> I think most passes just do not free summaries with -flto.  We probably want
> to fix it to make it possible to compile multiple units i.e. from plugin by
> adding release_summaries method...
> So I would say it is OK to do the same as others do and leak it with -flto.
>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>> index e457166ea39..724c26e03f6 100644
>> --- a/gcc/ipa-pure-const.c
>> +++ b/gcc/ipa-pure-const.c
>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "tree-scalar-evolution.h"
>>  #include "intl.h"
>>  #include "opts.h"
>> +#include "ssa.h"
>>
>>  /* Lattice values for const and pure functions.  Everything starts out
>> being const, then may drop to pure and then neither depending on
>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>
>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>
>> +enum malloc_state_e
>> +{
>> +  PURE_CONST_MALLOC_TOP,
>> +  PURE_CONST_MALLOC,
>> +  PURE_CONST_MALLOC_BOTTOM
>> +};
>
> It took me a while to work out what PURE_CONST means here :)
> I would just call it something like STATE_MALLOC_TOP... or so.
> ipa_pure_const is outdated name from the time pass was doing only
> those two.
>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>
>>  static vec funct_state_vec;
>>
>> +/* A map from node to subset of callees. The subset contains those callees
>> + * whose return-value is returned by the node. */
>> +static hash_map< cgraph_node *, vec* > *return_callees_map;
>> +
>
> Hehe, a special case of return jump function.  We ought to support those more 
> generally.
> How do you keep it up to date over callgraph changes?
>> @@ -921,6 +1055,23 @@ end:
>>if (TREE_NOTHROW (decl))
>>  l->can_throw = false;
>>
>> +  if (ipa)
>> +{
>> +  vec v = vNULL;
>> +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>> +  if (DECL_IS_MALLOC (decl))
>> + l->malloc_state = PURE_CONST_MALLOC;
>> +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>> + {
>> +   l->malloc_state = PURE_CONST_MALLOC_TOP;
>> +   vec *callees_p = new vec (vNULL);
>> +   for (unsigned i = 0; i < v.length (); ++i)
>> + callees_p->safe_push (v[i]);
>> +   return_callees_map->put (fn, callees_p);
>> + }
>> +  v.release ();
>> +}
>> +
>
> I would do non-ipa variant, too.  I think most attributes can be detected 
> that way
> as well.
>
> The patch generally makes sense to me.  It would be nice to make it easier to 
> write such
> a basic propagators across callgraph (perhaps adding a template doing the 
> basic
> propagation logic). Also I think you need to solve the problem with keeping 
> your
> summaries up to date across callgraph node removal and duplications.
Thanks for the suggestions, I will try to address them in a follow-up patch.
IIUC, I would need to modify ipa-pure-const cgraph hooks -
add_new_function, remove_node_data, duplicate_node_data
to keep return_callees_map up-to-date across callgraph node insertions
and removal ?

Also, if instead of having a separate data-structure like return_callees_map,
should we rather have a flag within cgraph_edge, which marks that the
caller may return the value of the callee ?

Thanks,
Prathamesh
>
> Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-19 Thread Jan Hubicka
> 
> * LTO and memory management
> This is a general question about LTO and memory management.
> IIUC the following sequence takes place during normal LTO:
> LGEN: generate_summary, write_summary
> WPA: read_summary, execute ipa passes, write_opt_summary
> 
> So I assumed it was OK in LGEN to allocate return_callees_map in
> generate_summary and free it in write_summary and during WPA, allocate
> return_callees_map in read_summary and free it after execute (since
> write_opt_summary does not require return_callees_map).
> 
> However with fat LTO, it seems the sequence changes for LGEN with
> execute phase takes place after write_summary. However since
> return_callees_map is freed in pure_const_write_summary and
> propagate_malloc() accesses it in execute stage, it results in
> segmentation fault.
> 
> To work around this, I am using the following hack in 
> pure_const_write_summary:
> // FIXME: Do not free if -ffat-lto-objects is enabled.
> if (!global_options.x_flag_fat_lto_objects)
>   free_return_callees_map ();
> Is there a better approach for handling this ?

I think most passes just do not free summaries with -flto.  We probably want
to fix it to make it possible to compile multiple units i.e. from plugin by
adding release_summaries method...
So I would say it is OK to do the same as others do and leak it with -flto.
> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
> index e457166ea39..724c26e03f6 100644
> --- a/gcc/ipa-pure-const.c
> +++ b/gcc/ipa-pure-const.c
> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-scalar-evolution.h"
>  #include "intl.h"
>  #include "opts.h"
> +#include "ssa.h"
>  
>  /* Lattice values for const and pure functions.  Everything starts out
> being const, then may drop to pure and then neither depending on
> @@ -69,6 +70,15 @@ enum pure_const_state_e
>  
>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>  
> +enum malloc_state_e
> +{
> +  PURE_CONST_MALLOC_TOP,
> +  PURE_CONST_MALLOC,
> +  PURE_CONST_MALLOC_BOTTOM
> +};

It took me a while to work out what PURE_CONST means here :)
I would just call it something like STATE_MALLOC_TOP... or so.
ipa_pure_const is outdated name from the time pass was doing only
those two.
> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>  
>  static vec funct_state_vec;
>  
> +/* A map from node to subset of callees. The subset contains those callees
> + * whose return-value is returned by the node. */
> +static hash_map< cgraph_node *, vec* > *return_callees_map;
> +

Hehe, a special case of return jump function.  We ought to support those more 
generally.
How do you keep it up to date over callgraph changes?
> @@ -921,6 +1055,23 @@ end:
>if (TREE_NOTHROW (decl))
>  l->can_throw = false;
>  
> +  if (ipa)
> +{
> +  vec v = vNULL;
> +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
> +  if (DECL_IS_MALLOC (decl))
> + l->malloc_state = PURE_CONST_MALLOC;
> +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
> + {
> +   l->malloc_state = PURE_CONST_MALLOC_TOP;
> +   vec *callees_p = new vec (vNULL);
> +   for (unsigned i = 0; i < v.length (); ++i)
> + callees_p->safe_push (v[i]);
> +   return_callees_map->put (fn, callees_p);
> + }
> +  v.release ();
> +}
> +

I would do non-ipa variant, too.  I think most attributes can be detected that 
way
as well.

The patch generally makes sense to me.  It would be nice to make it easier to 
write such
a basic propagators across callgraph (perhaps adding a template doing the basic
propagation logic). Also I think you need to solve the problem with keeping your
summaries up to date across callgraph node removal and duplications.

Honza


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-19 Thread Jan Hubicka
> >   struct D: B {
> > char buf[32];
> > virtual void* f (unsigned n) {
> >   if (n < 32)
> >   return n <= 32 ? buf : B::f (n);
> > }
> > 
> > Breaking foo's attribute malloc constraint.
> > 
> > In other words, I think virtual functions need to be excluded
> > from the list (unless they're defined in a class marked final,
> > or unless we know they're not overridden to break the constraint
> > like above).
> 
> But we are annotating the actual decl, not the type in the class
> struct.

Yep and we will be able to use the information in case we devirtualize
the call.  So this seems OK to me.

Honza
> 
> Richard.


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-18 Thread Richard Biener
On Wed, 17 May 2017, Martin Sebor wrote:

> > The patch passes bootstrap+test on x86_64 and found a few functions in
> > the source tree (attached func_names.txt) that could be annotated with
> > malloc (I gave a brief look at some of the functions and didn't appear
> > to be false positives but I will recheck thoroughly)
> 
> virtual char* libcp1::compiler::find(std::__cxx11::string&) const
> 
> The virtual on the list of your candidates gave me pause.  Consider
> this completely contrived example:
> 
>   struct B {
> virtual void* f (unsigned n) {
>   return new char [n];
> }
>   };
> 
>   void* foo (B , unsigned n)
>   {
> return b.f (n);
>   }
> 
> Based on these definitions alone both functions are candidates
> for attribute malloc.
> 
> But suppose foo is called with an object of a type derived from
> B that overrides f() to do something wacky (but strictly not
> invalid) like:
> 
>   struct D: B {
> char buf[32];
> virtual void* f (unsigned n) {
>   if (n < 32)
>   return n <= 32 ? buf : B::f (n);
> }
> 
> Breaking foo's attribute malloc constraint.
> 
> In other words, I think virtual functions need to be excluded
> from the list (unless they're defined in a class marked final,
> or unless we know they're not overridden to break the constraint
> like above).

But we are annotating the actual decl, not the type in the class
struct.

Richard.


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-17 Thread Martin Sebor

The patch passes bootstrap+test on x86_64 and found a few functions in
the source tree (attached func_names.txt) that could be annotated with
malloc (I gave a brief look at some of the functions and didn't appear
to be false positives but I will recheck thoroughly)


virtual char* libcp1::compiler::find(std::__cxx11::string&) const

The virtual on the list of your candidates gave me pause.  Consider
this completely contrived example:

  struct B {
virtual void* f (unsigned n) {
  return new char [n];
}
  };

  void* foo (B , unsigned n)
  {
return b.f (n);
  }

Based on these definitions alone both functions are candidates
for attribute malloc.

But suppose foo is called with an object of a type derived from
B that overrides f() to do something wacky (but strictly not
invalid) like:

  struct D: B {
char buf[32];
virtual void* f (unsigned n) {
  if (n < 32)
  return n <= 32 ? buf : B::f (n);
}

Breaking foo's attribute malloc constraint.

In other words, I think virtual functions need to be excluded
from the list (unless they're defined in a class marked final,
or unless we know they're not overridden to break the constraint
like above).

Martin


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-16 Thread Richard Biener
On Mon, 15 May 2017, Prathamesh Kulkarni wrote:

> Hi,
> I have attached a bare-bones prototype patch that propagates malloc attribute 
> in
> ipa-pure-const. As far as I understand, from the doc a function could
> be annotated
> with malloc attribute if it returns a pointer to a newly allocated
> memory block (uninitialized or zeroed) and the pointer does not alias
> any other valid pointer ?
> 
> * Analysis
> The analysis used by the patch in malloc_candidate_p is too conservative,
> and I would be grateful for suggestions on improving it.
> Currently it only checks if:
> 1) The function returns a value of pointer type.
> 2) SSA_NAME_DEF_STMT (return_value) is either a function call or a phi, and
> SSA_NAME_DEF_STMT(each element of phi) is function call.
> 3) The return-value has immediate uses only within comparisons (gcond
> or gassign) and return_stmt (and likewise a phi arg has immediate use only
> within comparison or the phi stmt).
>
> The intent is that malloc_candidate_p(node) returns true if node
> returns the return value of it's callee, and the return value is only
> used for comparisons within node.
> Then I assume it's safe (although conservative) to decide that node
> could possibly be a malloc function if callee is found to be malloc ?
> 
> for eg:
> void *f(size_t n)
> {
>   void *p = __builtin_malloc (n);
>   if (p == 0)
> __builtin_abort ();
>   return p;
> }
> 
> malloc_candidate_p would determine f to be a candidate for malloc
> attribute since it returns the return-value of it's callee
> (__builtin_malloc) and the return value is used only within comparison
> and return_stmt.
> 
> However due to the imprecise analysis it misses this case:
> char  **g(size_t n)
> {
>   char **p = malloc (sizeof(char *) * 10);
>   for (int i = 0; i < 10; i++)
> p[i] = malloc (sizeof(char) * n);
>   return p;
> }
> I suppose g() could be annotated with malloc here ?

It cannot because what p points to is initialized.  If you do then

 char **x = g(10);
 **x = 1;

will be optimized away.  Now what would be interesting is to do
"poor mans IPA PTA", namely identify functions that are a) small,
b) likely return a newly allocated pointer.  At PTA time then
"inline" all such wrappers, but only for the PTA constraints.

That will give more precise points-to results (and can handle more
cases, esp. initialization) than just annotating functions with 'malloc'.

+  /* With non-call exceptions we can't say for sure if other function
+body was not possibly optimized to still throw.  */
+  if (!non_call || node->binds_to_current_def_p ())
+   {
+ DECL_IS_MALLOC (node->decl) = true;
+ *changed = true;

I don't see how malloc throwing would be an issue.

+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, retval)
+if (gcond *cond = dyn_cast (use_stmt))
+  {
+   enum tree_code code = gimple_cond_code (cond);
+   if (TREE_CODE_CLASS (code) != tcc_comparison)

always false

+ RETURN_FROM_IMM_USE_STMT (use_iter, false);

I think you want to disallow eq/ne_expr against sth not integer_zerop.

+else if (gassign *ga = dyn_cast (use_stmt))
+  {
+   enum tree_code code = gimple_assign_rhs_code (ga);
+   if (TREE_CODE_CLASS (code) != tcc_comparison)
+ RETURN_FROM_IMM_USE_STMT (use_iter, false);

likewise.

+static bool
+malloc_candidate_p (function *fun, vec& callees)
+{
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fun)
+{
+  for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+  !gsi_end_p (gsi);
+  gsi_next ())
+   {
+ greturn *ret_stmt = dyn_cast (gsi_stmt (gsi));
+ if (ret_stmt)

I think you want do do this much cheaper by only walking the last
stmt of predecessor(s) of EXIT_BLOCK_FOR_FN.  (s) because
we have multiple return stmts for

int foo (int i)
{
  if (i)
return;
  else
return i;
}

but all return stmts will be the last stmt in one of the exit block
predecessors.

+ if (!check_retval_uses (retval, ret_stmt))
+   return false;
+
+ gimple *def = SSA_NAME_DEF_STMT (retval);
+ if (gcall *call_stmt = dyn_cast (def))
+   {
+ tree callee_decl = gimple_call_fndecl (call_stmt);
+ /* FIXME: In case of indirect call, callee_decl might be 
NULL
+Not sure what to do in that case, punting for now.  
*/
+ if (!callee_decl)
+   return false;
+ cgraph_node *callee = cgraph_node::get_create 
(callee_decl);
+ callees.safe_push (callee);
+   }
+ else if (gphi *phi = dyn_cast (def))
+   for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+ {
+   tree arg = gimple_phi_arg_def (phi, i);
+   if (TREE_CODE (arg) != SSA_NAME)
+ return false;
+   if (!check_retval_uses (arg, phi))
+ return 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-15 Thread Jeff Law

On 05/15/2017 04:39 AM, Prathamesh Kulkarni wrote:

Hi,
I have attached a bare-bones prototype patch that propagates malloc attribute in
ipa-pure-const. As far as I understand, from the doc a function could
be annotated
with malloc attribute if it returns a pointer to a newly allocated
memory block (uninitialized or zeroed) and the pointer does not alias
any other valid pointer ?

* Analysis
The analysis used by the patch in malloc_candidate_p is too conservative,
and I would be grateful for suggestions on improving it.
Currently it only checks if:
1) The function returns a value of pointer type.
2) SSA_NAME_DEF_STMT (return_value) is either a function call or a phi, and
 SSA_NAME_DEF_STMT(each element of phi) is function call.
3) The return-value has immediate uses only within comparisons (gcond
or gassign) and return_stmt (and likewise a phi arg has immediate use only
within comparison or the phi stmt).
ISTM the return value *must* be either the result of a call to a 
function with the malloc attribute or NULL.  It can't be a mix of 
malloc'd objects or something else (such as a passed-in object).




malloc_candidate_p would determine f to be a candidate for malloc
attribute since it returns the return-value of it's callee
(__builtin_malloc) and the return value is used only within comparison
and return_stmt.

However due to the imprecise analysis it misses this case:
char  **g(size_t n)
{
   char **p = malloc (sizeof(char *) * 10);
   for (int i = 0; i < 10; i++)
 p[i] = malloc (sizeof(char) * n);
   return p;
}
I suppose g() could be annotated with malloc here ?
I think that's up to us to decide.  So the question becomes what makes 
the most sense from a user and optimization standpoint.



I would start relatively conservatively and look to add further analysis 
to detect more malloc functions over time.  You've already identified 
comparisons as valid uses, but ISTM the pointer could be passed as an 
argument, stored into memory and all kinds of other things.


You'll probably want instrumentation to log cases where something looked 
like a potential candidate, but had to be rejected for some reason.


Jeff


[RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-15 Thread Prathamesh Kulkarni
Hi,
I have attached a bare-bones prototype patch that propagates malloc attribute in
ipa-pure-const. As far as I understand, from the doc a function could
be annotated
with malloc attribute if it returns a pointer to a newly allocated
memory block (uninitialized or zeroed) and the pointer does not alias
any other valid pointer ?

* Analysis
The analysis used by the patch in malloc_candidate_p is too conservative,
and I would be grateful for suggestions on improving it.
Currently it only checks if:
1) The function returns a value of pointer type.
2) SSA_NAME_DEF_STMT (return_value) is either a function call or a phi, and
SSA_NAME_DEF_STMT(each element of phi) is function call.
3) The return-value has immediate uses only within comparisons (gcond
or gassign) and return_stmt (and likewise a phi arg has immediate use only
within comparison or the phi stmt).

The intent is that malloc_candidate_p(node) returns true if node
returns the return value of it's callee, and the return value is only
used for comparisons within node.
Then I assume it's safe (although conservative) to decide that node
could possibly be a malloc function if callee is found to be malloc ?

for eg:
void *f(size_t n)
{
  void *p = __builtin_malloc (n);
  if (p == 0)
__builtin_abort ();
  return p;
}

malloc_candidate_p would determine f to be a candidate for malloc
attribute since it returns the return-value of it's callee
(__builtin_malloc) and the return value is used only within comparison
and return_stmt.

However due to the imprecise analysis it misses this case:
char  **g(size_t n)
{
  char **p = malloc (sizeof(char *) * 10);
  for (int i = 0; i < 10; i++)
p[i] = malloc (sizeof(char) * n);
  return p;
}
I suppose g() could be annotated with malloc here ?

The patch uses return_calles_map which is a hash_map such
that the return value of node is return value of one of these callees.
For above eg it would be 
The analysis phase populates return_callees_map, and the propagation
phase uses it to take the "meet" of callees.

* LTO and memory management
This is a general question about LTO and memory management.
IIUC the following sequence takes place during normal LTO:
LGEN: generate_summary, write_summary
WPA: read_summary, execute ipa passes, write_opt_summary

So I assumed it was OK in LGEN to allocate return_callees_map in
generate_summary and free it in write_summary and during WPA, allocate
return_callees_map in read_summary and free it after execute (since
write_opt_summary does not require return_callees_map).

However with fat LTO, it seems the sequence changes for LGEN with
execute phase takes place after write_summary. However since
return_callees_map is freed in pure_const_write_summary and
propagate_malloc() accesses it in execute stage, it results in
segmentation fault.

To work around this, I am using the following hack in pure_const_write_summary:
// FIXME: Do not free if -ffat-lto-objects is enabled.
if (!global_options.x_flag_fat_lto_objects)
  free_return_callees_map ();
Is there a better approach for handling this ?

Also I am not sure if I have written cgraph_node::set_malloc_flag[_1] correctly.
I tried to imitate cgraph_node::set_const_flag.

The patch passes bootstrap+test on x86_64 and found a few functions in
the source tree (attached func_names.txt) that could be annotated with
malloc (I gave a brief look at some of the functions and didn't appear
to be false positives but I will recheck thoroughly)

Does the patch look in the right direction ?
I would be grateful for suggestions on improving it.

Thanks,
Prathamesh
virtual char* libcp1::compiler::find(std::__cxx11::string&) const
gomp_malloc
virtual char* libcc1::compiler::find(std::__cxx11::string&) const
void* GTM::xrealloc(void*, size_t, bool)
concat
char* gen_internal_sym(const char*)
void* ira_allocate(size_t)
char* gen_fake_label()
char* selftest::locate_file(const char*)
const char* find_plugindir_spec_function(int, const char**)
reconcat
xvasprintf
char* rtx_reader::read_until(const char*, bool)
_Tp* 
__gnu_cxx::__detail::__mini_vector<_Tp>::allocate(__gnu_cxx::__detail::__mini_vector<_Tp>::size_type)
 [with _Tp = long unsigned int*]
xstrndup
const char* replace_extension_spec_func(int, const char**)
void* GTM::xcalloc(size_t, bool)
_Tp* 
__gnu_cxx::__detail::__mini_vector<_Tp>::allocate(__gnu_cxx::__detail::__mini_vector<_Tp>::size_type)
 [with _Tp = unsigned int*]
xstrdup
void* GTM::xmalloc(size_t, bool)
void* base_pool_allocator::allocate() [with TBlockAllocator = 
memory_block_pool]
char* ix86_offload_options()
basic_block_def* alloc_block()
xmemdup
char* build_message_string(const char*, ...)
make_relative_prefix
gomp_malloc_cleared
make_relative_prefix_ignore_links
choose_temp_base
make_temp_file
xasprintf
char* file_name_as_prefix(diagnostic_context*, const char*)
void* yyalloc(yy_size_t)
void* ggc_internal_cleared_alloc(size_t, void (*)(void*), size_t, size_t)
void*