On Wed, Oct 21, 2020 at 5:21 AM Gary Oblock <g...@amperecomputing.com> wrote:
>
> >IPA transforms happen when get_body is called.  With LTO this also
> >triggers reading the body from disk.  So if you want to see all bodies
> >and work on them, you can simply call get_body on everything but it will
> >result in increased memory use since everything will be loaded from disk
> >and expanded (by inlining) at once instead of doing it on a per-function
> >basis.
> Jan,
>
> Doing
>
> FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) node->get_body ();
>
> instead of
>
> FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) node->get_untransformed_body ();
>
> instantaneously breaks everything...

I think during WPA you cannot do ->get_body (), only
->get_untransformed_body ().  But we don't know yet where in the IPA
process you're experiencing the issue.

Richard.

> Am I missing something?
>
> Gary
> ________________________________
> From: Jan Hubicka <hubi...@ucw.cz>
> Sent: Tuesday, October 20, 2020 4:34 AM
> To: Richard Biener <richard.guent...@gmail.com>
> Cc: GCC Development <gcc@gcc.gnu.org>; Gary Oblock <g...@amperecomputing.com>
> Subject: Re: Where did my function go?
>
>
> > > On Tue, Oct 20, 2020 at 1:02 PM Martin Jambor <mjam...@suse.cz> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Tue, Oct 20 2020, Richard Biener wrote:
> > > > > On Mon, Oct 19, 2020 at 7:52 PM Gary Oblock 
> > > > > <g...@amperecomputing.com> wrote:
> > > > >>
> > > > >> Richard,
> > > > >>
> > > > >> I guess that will work for me. However, since it
> > > > >> was decided to remove an identical function,
> > > > >> why weren't the calls to it adjusted to reflect it?
> > > > >> If the call wasn't transformed that means it will
> > > > >> be mapped at some later time. Is that mapping
> > > > >> available to look at? Because using that would
> > > > >> also be a potential solution (assuming call
> > > > >> graph information exists for the deleted function.)
> > > > >
> > > > > > I'm not sure what the transitional cgraph looks like
> > > > > during WPA analysis (which is what we're talking about?),
> > > > > but definitely the IL is unmodified in that state.
> > > > >
> > > > > Maybe Martin has an idea.
> > > > >
> > > >
> > > > > Exactly, the cgraph_edge is where the correct call information is
> > > > > stored until the inlining transformation phase calls
> > > > > cgraph_edge::redirect_call_stmt_to_callee on it - inlining is
> > > > > a special pass in this regard that performs this IPA-infrastructure
> > > > > function in addition to actual inlining.
> > > >
> > > > > Call information in the cgraph means the callee itself but also the
> > > > > information in e->callee->clone.param_adjustments, which might be
> > > > > interesting for any struct-reorg-like optimizations (...and in future
> > > > > possibly in other transformation summaries).
> > > >
> > > > > The late IPA passes are in a very unfortunate spot here since they run
> > > > > before the real-IPA transformation phases but after unreachable node
> > > > > removals and after clone materializations and so can see some but not
> > > > > all of the changes performed by real IPA passes.  The reason for that is
> > > > > good cache locality when late IPA passes are either not run at all or
> > > > > only look at a small portion of the compilation unit.  In such a case IPA
> > > > > transformations of a function are followed by all the late passes
> > > > > working on the same function.
> > > >
> > > > Late IPA passes are unfortunately second class citizens and I would
> > > > strongly recommend not to use them since they do not fit into our
> > > > otherwise robust IPA framework very well.  We could probably provide a
> > > > mechanism that would allow late IPA passes to run all normal IPA
> > > > transformations on a function so they could clearly see what they are
> > > > looking at, but extensive use would slow compilation down so its use
> > > > would be frowned upon at the very least.
> > >
> > > So IPA PTA does get_body () on the nodes it wants to analyze and I
> > > thought that triggers any pending IPA transforms?
> >
> > > Yes, it does (and get_untransformed_body does not)
> And to correct Martin's explanation a bit: the late IPA passes are
> intended to work, though I was mostly planning them for prototyping true
> ipa passes and also possibly for implementing passes that inspect only
> a few functions.
>
> IPA transforms happen when get_body is called.  With LTO this also
> triggers reading the body from disk.  So if you want to see all bodies
> and work on them, you can simply call get_body on everything but it will
> result in increased memory use since everything will be loaded from disk
> and expanded (by inlining) at once instead of doing it on a per-function
> basis.
>
> get_body is simply meant to arrange the body on demand.  The passmanager
> uses it before late passes are executed and ipa-pta uses it before it
> builds constraints (that is not good for reasons described above).
>
> Clone materialization is also triggered by get_body. The clone
> materialization pass mostly happens to remove unreachable function
> bodies. I plan to get rid of it since, now that we are better at doing ipa
> transforms, it brings in a lot of bodies already. For cc1plus it is well
> over 1GB of memory.
>
> Honza
> >
> > Honza
> > >
> > > Richard.
> > >
> > > > Martin
> > > >

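To illustrate the memory trade-off discussed in this thread, here is a
minimal, self-contained C++ sketch. It does not use GCC's real cgraph API:
ToyNode, peak_all_at_once and peak_per_function are hypothetical names, and
the sizes are made-up units. It only models the idea that get_body applies
pending transforms (which can grow the body through inlining) while
get_untransformed_body does not, and that materializing every body up front
makes the peak footprint the sum of all bodies rather than the largest one.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call-graph node; this is NOT GCC's cgraph_node.
struct ToyNode {
  std::string name;
  std::size_t size_on_disk;   // size of the untransformed body
  std::size_t inline_growth;  // extra size once pending transforms are applied
  bool loaded = false;        // body is resident in memory
  bool transformed = false;   // pending transforms have been applied

  ToyNode(std::string n, std::size_t disk, std::size_t growth)
      : name(std::move(n)), size_on_disk(disk), inline_growth(growth) {}

  // Current in-memory footprint of this node's body.
  std::size_t resident_size() const {
    if (!loaded) return 0;
    return size_on_disk + (transformed ? inline_growth : 0);
  }

  // Loads the body only; pending transforms stay pending.
  std::size_t get_untransformed_body() {
    loaded = true;
    return resident_size();
  }

  // Loads the body if needed, then applies the pending transforms.
  std::size_t get_body() {
    loaded = true;
    transformed = true;
    return resident_size();
  }

  // Drops the in-memory body, as if this function were finished.
  void release() {
    loaded = false;
    transformed = false;
  }
};

// Sum of the footprints of all currently resident bodies.
std::size_t total_resident(const std::vector<ToyNode>& nodes) {
  std::size_t sum = 0;
  for (const ToyNode& n : nodes) sum += n.resident_size();
  return sum;
}

// Materialize every body up front: the peak is the sum of all bodies.
std::size_t peak_all_at_once(std::vector<ToyNode> nodes) {
  for (ToyNode& n : nodes) n.get_body();
  return total_resident(nodes);
}

// Materialize, inspect and release one function at a time:
// the peak is the largest single transformed body.
std::size_t peak_per_function(std::vector<ToyNode> nodes) {
  std::size_t peak = 0;
  for (ToyNode& n : nodes) {
    n.get_body();
    std::size_t now = total_resident(nodes);
    if (now > peak) peak = now;
    n.release();
  }
  return peak;
}
```

In this toy model a late pass that only needs a few functions pays for just
those bodies, which matches the advice above to avoid calling get_body on
everything unless the pass really has to see every transformed body.
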