pengfei added a comment.

In D113107#3744782 <https://reviews.llvm.org/D113107#3744782>, @rjmccall wrote:

> In D113107#3744505 <https://reviews.llvm.org/D113107#3744505>, @pengfei wrote:
>
>>> I'm not sure what optimization you mean. Because the ABI returns 16-bit and 
>>> 32-bit FP values differently, there really isn't a way that we can return a 
>>> value without going through a truncation/extension cycle.
>>
>> I explained it to Zahira offline. I forgot we have different expectation for 
>> the patch, thus we were talking different optimization to each other. I 
>> expected each backend has the ability to lower half operations. So I 
>> emphasized not to do the promotion or eliminate unnecessary promotion at the 
>> begining. While I see your point, we can only eliminate or combine 
>> unpromotion to the following promotion, so that we don't leave half 
>> operations to backends.
>>
>>> There's potential to eliminate those with IPO, but we should definitely 
>>> leave that for a different patch, for two reasons:
>>
>> I agree with you, IPO is the only chance to do elimination.
>>
>>> Somehow we've taken a huge step back on unpromotion, and I'm worried you're 
>>> now doing the exact thing I didn't want us doing and forcing all the 
>>> downstream clients to handle the possibility of a promoted result.
>>
>> However, I am still not persuaded we need to consider the backends not 
>> supporting half operations (if I understand your downstream clients 
>> correctly).
>
> Oh, no, LLVM is not what I mean by downstream clients.  That comment was 
> meant for Zahira and is about a purely internal matter to IRGen; the 
> downstream clients I meant are just the call sites within IRGen of e.g. 
> EmitScalarExpr.  We've been having some kind of persistent disconnect 
> throughout this review about IRGen's postconditions on expression emission.  
> Unfortunately, IRGen doesn't have good assertions in place for its 
> postconditions, so breaking the postconditions can silently turn into 
> miscompiles instead of assertion failures.  For example, Clang's IR emission 
> for `return` statements has some well-meaning code that tries to handle minor 
> type mismatches between the ABI-dictated IR return type and the IR type that 
> expression-emission produces by coercing the value through memory.  
> Unfortunately, that code is also triggered when expression emission just has 
> a bug and returns the wrong type, and so you can see a bunch of code in the 
> tests that's doing stuff like storing a pair of `float`s into memory and then 
> loading a pair of `half`s out, which is obviously incorrect.  None of this 
> has anything to do with our conversation about avoiding fptrunc/fpext pairs 
> over return boundaries.

Thanks @rjmccall for the explanation. I understand it a bit more now. So my 
proposal is still valid, though it doesn't solve the problem you described.
I don't know any details of the front-end, but I can see the problem in the 
test case. Assumably we don't do fpext then fptrunc for unary and binary 
expression, the problem still exists in, e.g.,

  _Float16 foo(_Float16 a, _Float16 b, _Float16 c) {
    return a + b + c;
  }

Can we always split it into two expressions, e.g.,

  _Float16 t = a + b + c;
  return t;

I think they are constant equivalent in C and C++ semantics. And it may solve 
the returning problem you descributes.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to