nickdesaulniers added a comment.

In D74094#4648319 <https://reviews.llvm.org/D74094#4648319>, @rjmccall wrote:

>> That's better than the status quo (no lifetime markers at all) but still 
>> suboptimal, and I don't think it will solve the particular case I care about 
>> (a particular Linux kernel driver written in C which is passing many 
>> aggregates by value).
>
> Ah, all in the same full-expression?  Then yeah, my idea wouldn't be good 
> enough to optimize your situation.

Yeah, here's an example:
https://github.com/torvalds/linux/blob/2cf0f715623872823a72e451243bbf555d10d032/drivers/gpu/drm/amd/display/dc/dml/calcs/dce_calcs.c#L564-L570
These nested function calls take a struct by value and return the same type. 
The struct contains a single `unsigned long long`.  When targeting 32b ARM, 
we're kind of hamstrung (I think) by 
https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#result-return;
 clang will "unwrap" this aggregate for other targets, but not 32b ARM, I think 
because of that final bullet point about composite types.  Though, reading the 
next section 
https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#parameter-passing,
 I suspect clang may also be missing out the logic referring to "NCRN."  
Perhaps LLVM performs that later, but by not unwrapping the parameters, the 
omission of lifetime markers is painful.

I'm certain I could rewrite this code to not use a 1-element aggregate, but I 
think this is a common case of discrepancies where GCC does much better than 
clang in terms of stack usage. D74094 <https://reviews.llvm.org/D74094> causes 
a 5x reduction in stack usage for this function in particular for 32b arm 
targets, so I'd rather target the root issue here which I think is clang.

>> Do I want to create a flag for that? Not particularly; it does mean 
>> supporting BOTH codegen patterns; I'd rather have a flag to control the 
>> optimization or not do it at all rather than support 2 different 
>> optimizations.
>
> The received wisdom here is that it's very helpful to have a flag because it 
> lets users test the effect of different changes independently.  Users mostly 
> don't live on trunk, and LLVM releases are roughly every six months, so when 
> users get a new compiler it's going to have at least six months of 
> accumulated changes in it.  If a project breaks with a new compiler, it's 
> really great to be able to isolate which change is causing the problem 
> without having to rebuild the compiler.
>
> The fact that we have dynamic checking of this from ASan does help a lot, 
> though.

Ok, I will add that, and a release note about it and that ASan can help detect 
this as well.

>> One thing I noticed, simply printing the expression when we were about to 
>> add the lifetime cleanup; I noticed some examples would trigger on 
>> CXXConstructExpr, CXXTemporaryObjectExpr, and CXXOperatorCallExpr. Those 
>> feel very C++ specific; I do wonder if we could do something differently for 
>> C, but I think we should not since as @rjmccall said "we'd have a consistent 
>> rule for all types and targets" which IMO is nicer+simpler.
>
> Well, if the language rules are different, they're different.  (More on that 
> in a second.)  As far as I know, the lifetime of a parameter variable in C is 
> never specified as outliving the call, and the entire concept of a 
> full-expression is a C++ism.  We could do something more aggressive in C.

I could only find language in the latest draft about lifetime of a parameter, 
nothing about lifetime of arguments. See footnote 206 in 
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf pdf page 178.

>> Converting the above widget example to C, I noticed that we're also missing 
>> the lifetime markers on compound literals!
>
> Yeah, so this is probably because the lifetime of a compound literal is 
> somewhat surprising in C: it's actually the containing block scope.  We could 
> emit lifetime markers, but we'd have to make sure we emitted them in the 
> right place.  In pure C, I think this is only detectable if the user actually 
> takes the address of the literal; unlike C++, C doesn't implicitly expose the 
> address of temporaries or have destructors that allow their lifetime to be 
> directly tested.  But in ObjC, you could have a compound literal with 
> weak/strong references in it, which arguably are required to be destroyed at 
> the close of the enclosing block.

Ack; will track that as a separate TODO.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74094/new/

https://reviews.llvm.org/D74094

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to