Hi Honza,

Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597891.html

Thanks,
Lili.

> -----Original Message-----
> From: Gcc-patches <gcc-patches-bounces+lili.cui=intel....@gcc.gnu.org> On
> Behalf Of Cui, Lili via Gcc-patches
> Sent: Sunday, July 10, 2022 10:05 PM
> To: Jan Hubicka <hubi...@kam.mff.cuni.cz>
> Cc: Lu, Hongjiu <hongjiu...@intel.com>; Liu, Hongtao
> <hongtao....@intel.com>; gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH] Add a heuristic for eliminate redundant load and store
> in inline pass.
>
> > -----Original Message-----
> > From: Jan Hubicka <hubi...@kam.mff.cuni.cz>
> >
> > This is an interesting idea. Basically we want to guess if inlining
> > will make SRA and/or store->load propagation possible. I think the
> > solution using INLINE_HINT may be a bit too trigger-happy, since it is
> > very common that this happens and with -O3 the hints are taken quite
> > seriously.
> >
> > We already have a mechanism to predict this situation by simply
> > expecting that stores to addresses pointed to by a function parameter
> > will be eliminated by 50%. See eliminated_by_inlining_prob.
> >
> > I was thinking that we may combine it with the knowledge that the
> > parameter points to caller-local memory (which is done by llvm's
> > heuristics), which can be added to IPA predicates.
> >
> > The idea of checking that the actual store in question is paired with
> > a load on the caller side is a bit harder: one needs to invent a
> > representation for such conditions. So I wonder how much extra help we
> > need for the critical inlining to happen in ImageMagick?
>
> Hi Honza,
>
> I really appreciate the feedback. I found that eliminated_by_inlining_prob
> does eliminate the stmt 50% of the time, but the gap is still big.
> SRA cannot split the callee's parameter because of this rule: "Do not
> decompose non-BLKmode parameters in a way that would create a BLKmode
> parameter. Especially for pass-by-reference (hence, pointer type
> parameters), it's not worth it."
>
> Critical inline function information
>
> Caller: GetVirtualPixelsFromNexus
> size: 541
> time: 484.08
> e->freq: 0.83
>
> Callee: SetPixelCacheNexusPixels
> nonspec time: 46.60
> time: 36.18
> size: 87
>
> Since the callee's insn count (87) is bigger than inline_insns_auto (30)
> and there is no hint, inlining depends on "big_speedup_p (e)".
> 484.08 (caller_time) * 0.15 (param_inline_min_speedup == 15) = 72.61,
> which means the callee's time should be at least 72.61, but the callee's
> time is 46.60, so we need to lower param_inline_min_speedup to 3 or 4.
> I checked the history
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?format=multiple&id=83665)
> and saw that you tried changing it to 8, but that increased the gzip code
> size by 2.5KB, so I want to add a heuristic hint for it.
>
> Thanks,
> Lili.
>
> >
> > Honza
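
For reference, the threshold arithmetic in the quoted message can be sketched as below. This is only a simplified model of the reasoning as written there (caller time multiplied by the min-speedup percentage), not GCC's actual big_speedup_p, which as far as I understand compares estimated uninlined and inlined call times. The function names min_callee_time and passes_simplified_check are hypothetical and exist only for this illustration.

```python
# Simplified model of the check described in the quoted message.
# Assumption: the threshold is caller_time * (min_speedup_percent / 100),
# exactly as the message computes it; this is NOT GCC's real big_speedup_p.

CALLER_TIME = 484.08          # GetVirtualPixelsFromNexus, from the message
CALLEE_NONSPEC_TIME = 46.60   # SetPixelCacheNexusPixels, from the message
MIN_SPEEDUP_PERCENT = 15      # param_inline_min_speedup value in the message


def min_callee_time(caller_time: float, min_speedup_percent: int) -> float:
    """Return the callee time needed to pass the simplified check."""
    return caller_time * min_speedup_percent / 100.0


def passes_simplified_check(caller_time: float, callee_time: float,
                            min_speedup_percent: int) -> bool:
    """True if the callee time reaches the simplified threshold."""
    return callee_time >= min_callee_time(caller_time, min_speedup_percent)


if __name__ == "__main__":
    threshold = min_callee_time(CALLER_TIME, MIN_SPEEDUP_PERCENT)
    ok = passes_simplified_check(CALLER_TIME, CALLEE_NONSPEC_TIME,
                                 MIN_SPEEDUP_PERCENT)
    print(f"threshold at {MIN_SPEEDUP_PERCENT}%: {threshold:.2f}")
    print(f"callee time {CALLEE_NONSPEC_TIME:.2f} passes: {ok}")
```

Under this model the callee falls short of the threshold at the default value of 15. The exact parameter value at which the real check flips depends on GCC's own time estimates, so the sketch should not be used to derive it.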