On Wed, Jul 8, 2015 at 1:43 PM, Hal Finkel <[email protected]> wrote:
> ----- Original Message -----
>> From: "Xinliang David Li" <[email protected]>
>> To: "Chandler Carruth" <[email protected]>
>> Cc: [email protected], "<[email protected]> List"
>> <[email protected]>
>> Sent: Wednesday, July 8, 2015 12:25:18 AM
>> Subject: Re: [LLVMdev] Inline hint for methods defined in-class
>>
>> On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth
>> <[email protected]> wrote:
>> > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <[email protected]>
>> > wrote:
>> >>
>> >> I'm reviving this thread after a while and CCing cfe-commits as
>> >> suggested by David Blaikie. I've also collected numbers building
>> >> chrome (from chromium, on Linux) with and without this patch as
>> >> suggested by David. I've re-posted the proposed patch and
>> >> performance/size numbers collected at the top to make it easily
>> >> readable for those reading it through cfe-commits.
>> >
>> >
>> > First off, thanks for collecting the numbers and broadening the
>> > distribution. Also, sorry it took me so long to get over to this
>> > thread.
>> >
>> > I want to lay out my stance on this issue at a theoretical and
>> > practical
>> > level first. I'll follow up with thoughts on the numbers as well
>> > after that.
>> >
>> > I think that tying *any* optimizer behavior to the 'inline' keyword
>> > is
>> > fundamentally the wrong direction.
>>
>> Chandler, thanks for sharing your thought -- however I don't think it
>> is wrong, let alone 'fundamentally wrong'. Despite all the analysis
>> that can be done, the inliner is in the end heuristic based. In lack
>> of the profile data, when inlining two calls yield the same static
>> benefit and size cost, it is reasonable for the inliner to think the
>> call to the function with inline hint to yield more high
>> dynamic/runtime benefit -- thus it has a higher static size budget to
>> burn.
>>
>> >We have reasons why we have done this
>> > historically, and we can't just do an immediate about face, but we
>> > should be
>> > actively looking for ways to *reduce* the optimizer's reliance on
>> > this
>> > keyword to convey any meaning whatsoever.
>>
>> yes those additional things will be done, but they are really
>> orthogonal.
>>
>> >
>> > The reason I think that is the correct direction is because, for
>> > better or
>> > worse, the 'inline' keyword in C++ is not about optimization, but
>> > about
>> > linkage.
>>
>> It is about both optimization and linkage. In fact the linkage simply
>> serves as an implementation detail. In C++ standard 7.1.2, paragraph
>> 2 says:
>
> The fact that C++ combines, into one keyword, a change in semantics (linkage)
> and an optimization hint is quite unfortunate. I wish it were otherwise.
> However, as it stands, I support this change. The benchmark numbers are
> encouraging, and it replaces an implementation quirk with the underlying
> (unfortunate) language design choice. The implementation quirk is that
> putting the inline keyword on an in-class function definition changes the
> behavior of the optimizer. However, according to the language specification,
> that definition should have implied that keyword. While an implementation is
> certainly free to do arbitrary things with hints, this behavior violates the
> spirit of the language specification. It makes a meaningless use of a
> standardized keyword meaningful, and that's the greater transgression. In
> addition, it does tend to be the case that in-class function definitions are
> small and suitable for inlining.
Well said!
thanks,
David
>
>>
>> "A function declaration (8.3.5, 9.3, 11.3) with an inline specifier
>> declares an inline function. The inline specifier indicates to the
>> implementation that inline substitution of the function body at the
>> point of call is to be preferred to the usual function call
>> mechanism.
>> An implementation is not required to perform this inline substitution
>> at the point of call; however, even if this inline substitution is
>> omitted, the other rules for inline functions defined by 7.1.2 shall
>> still be respected."
>>
>> Developers see those and rely on those to give compiler the hints.
>>
>> Most importantly, paragraph 3 says:
>>
>> "A function defined within a class definition is an inline function.
>> The inline specifier shall not appear on a block scope function
>> declaration.93 If the inline specifier is used in a friend
>> declaration, that declaration shall be a definition or the function
>> shall have previously been declared inline."
>>
>> Here we can see regardless of how optimizer will honor the hint and
>> to
>> what extent, and based on what analysis,
>> it is basically incorrect to drop the attribute on the floor for
>> in-class function definitions. Eswaran's fix is justified with this
>> reason alone. The side effect of changing inliner behavior is
>> irrelevant.
>>
>> > It has a functional impact and can be both necessary or impossible
>> > to use to meet those functional requirements. This in turn leaves
>> > programmers in a lurch if the functional requirements are ever in
>> > tension
>> > with the optimizer requirements.
>>
>> Not sure what you mean. Performance conscious programmers use it all
>> the time.
>>
>> >
>> > We're also working really hard to get more widely deployed
>> > cross-module
>> > optimization strategies, in part to free programmers from the
>> > requirement
>> > that they put all their performance critical code in header files.
>> > That
>> > makes compilation faster, and has lots of benefits to the factoring
>> > and
>> > design of the code itself. We shouldn't then create an incentive to
>> > keep
>> > things in header files so that they pick up a hint to the
>> > optimizer.
>>
>> >
>> > Ultimately, the world will be a better place if we can eventually
>> > move code
>> > away from relying on the hint provided by the 'inline' keyword to
>> > the
>> > optimizer.
>> >
>>
>> While I would like to see that happen some day, I do think it is an
>> independent matter.
>>
>> >
>> > That doesn't mean that the core concept of hinting to the optimizer
>> > that a
>> > particular function is a particularly good candidate for inlining
>> > is without
>> > value.
>>
>> yes.
>>
>> >While I think it is a bad practice that we shouldn't encourage in
>> > code (especially portable code)
>>
>> yes -- there are indeed programmers who use this casually without
>> considering performance.
>>
>> > I can see the desire to at least have *some*
>> > attribute which is nothing more or less than a hint to the
>> > optimizer to
>> > inline harder[1].
>>
>> yes -- there are programmers who use the attribute consciously.
>>
>> > It would help people work around inliner bugs in the short
>> > term, and even help debug inliner-rooted optimization problems.
>>
>> I think it is a good hint to the compiler even in the longer term.
>> With PGO, we should minimize the reliance on the hint though.
>>
>> >Codebases
>> > with strong portability requirements could still (and probably
>> > should)
>> > forbid or tightly control access to this kind of hint. I would want
>> > really
>> > strong documentation about how this attribute *completely voids*
>> > your
>> > performance warranty (if such a thing exists) as from version to
>> > version of
>> > the compiler it may go from a helpful hint to a devastatingly bad
>> > hint.
>>
>> Why? If the compiler becomes smarter and smarter, the inline hint
>> will
>> become more and more irrelevant and eventually has no effect -- why
>> would the performance warranty be voided? If the compiler is not yet
>> smart enough, why would the compiler refuse to take the hint and
>> forbid developer provide the hint?
>>
>> > But
>> > I think I could be persuaded to live with such a hint existing. But
>> > I'm
>> > *really* uncomfortable with it being tied to something that also
>> > impacts
>> > linkage or other semantics of the program.
>>
>> For consistent with standard, we should pass the attribute. Linkage
>> is
>> not affected in anyway.
>>
>> >
>> > [1]: Currently, the only other hint we have available is pretty
>> > terrible as
>> > it *also* has semantic effects: the always_inline attribute.
>> >
>> >
>> >>
>> >> The proposed patch will add InlineHint to methods defined inside a
>> >> class:
>> >>
>> >> --- a/lib/CodeGen/CodeGenFunction.cpp
>> >> +++ b/lib/CodeGen/CodeGenFunction.cpp
>> >> @@ -630,7 +630,7 @@ void CodeGenFunction::StartFunction(GlobalDecl
>> >> GD,
>> >> if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D))
>> >> {
>> >> if (!CGM.getCodeGenOpts().NoInline) {
>> >> for (auto RI : FD->redecls())
>> >> - if (RI->isInlineSpecified()) {
>> >> + if (RI->isInlined()) {
>> >> Fn->addFnAttr(llvm::Attribute::InlineHint);
>> >> break;
>> >> }
>> >>
>> >> Here are the performance and size numbers I've collected:
>> >>
>> >>
>> >> - C++ subset of Spec: No performance effects, < 0.1% size increase
>> >> (all size numbers are text sizes returned by 'size')
>> >> - Clang: 0.9% performance improvement (both -O0 and -O2 on a large
>> >> .ii
>> >> file) , 4.1% size increase
>> >
>> >
>> > FWIW, this size increase seems *really* bad. I think that kills
>> > this
>> > approach even in the short term.
>>
>> Re. size and performance trade-off -- 0.9% performance improvement
>> should greatly win the size cost. Besides among all programs see,
>> only
>> clang sees this size increase with all the others seeing negligible
>> size increase.
>
> I agree. A 1% performance increase is worth a 4% code-size increase when not
> optimizing for size.
>
> -Hal
>
>>
>> This is not a short term vs long term situation. It is basically a
>> bug
>> fix that FE drops the attribute. If it exposes inliner heuristic bug,
>> that should be fixed/tuned separately. With the hint correctly
>> passed
>> in, Easwaran will do further tuning including time based analysis.
>>
>> >
>> >>
>> >> - Chrome: no performance improvement, 0.24% size increase
>> >> - Google internal benchmark suite (geomean of ~20 benchmarks):
>> >> ~1.8%
>> >> performance improvement, no size regression
>> >
>> >
>> > I'm also somewhat worried about the lack of any performance
>> > improvements
>> > outside of the Google benchmarks. That somewhat strongly suggests
>> > that our
>> > benchmarks are overly coupled to this hint already. The fact that
>> > neither
>> > Chrome, Clang, nor SPEC improved is... not at all encouraging.
>>
>> Other than Google benchmarks, we do see Clang improve performance.
>> Besides, current inliner needs to be further tuned in order to get
>> more performance benefit. Passing the hint through is simply an
>> enabler. Also remember that most of SPEC benchmarks are C programs.
>> C++ programs with heavy use of virtual functions may not benefit a
>> lot
>> either.
>>
>> David
>>
>>
>> >
>> > -Chandler
>> _______________________________________________
>> cfe-commits mailing list
>> [email protected]
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits