pengfei added a comment. In D113107#3744782 <https://reviews.llvm.org/D113107#3744782>, @rjmccall wrote:
> In D113107#3744505 <https://reviews.llvm.org/D113107#3744505>, @pengfei wrote: > >>> I'm not sure what optimization you mean. Because the ABI returns 16-bit and >>> 32-bit FP values differently, there really isn't a way that we can return a >>> value without going through a truncation/extension cycle. >> >> I explained it to Zahira offline. I forgot we have different expectation for >> the patch, thus we were talking different optimization to each other. I >> expected each backend has the ability to lower half operations. So I >> emphasized not to do the promotion or eliminate unnecessary promotion at the >> begining. While I see your point, we can only eliminate or combine >> unpromotion to the following promotion, so that we don't leave half >> operations to backends. >> >>> There's potential to eliminate those with IPO, but we should definitely >>> leave that for a different patch, for two reasons: >> >> I agree with you, IPO is the only chance to do elimination. >> >>> Somehow we've taken a huge step back on unpromotion, and I'm worried you're >>> now doing the exact thing I didn't want us doing and forcing all the >>> downstream clients to handle the possibility of a promoted result. >> >> However, I am still not persuaded we need to consider the backends not >> supporting half operations (if I understand your downstream clients >> correctly). > > Oh, no, LLVM is not what I mean by downstream clients. That comment was > meant for Zahira and is about a purely internal matter to IRGen; the > downstream clients I meant are just the call sites within IRGen of e.g. > EmitScalarExpr. We've been having some kind of persistent disconnect > throughout this review about IRGen's postconditions on expression emission. > Unfortunately, IRGen doesn't have good assertions in place for its > postconditions, so breaking the postconditions can silently turn into > miscompiles instead of assertion failures. For example, Clang's IR emission > for `return` statements has some well-meaning code that tries to handle minor > type mismatches between the ABI-dictated IR return type and the IR type that > expression-emission produces by coercing the value through memory. > Unfortunately, that code is also triggered when expression emission just has > a bug and returns the wrong type, and so you can see a bunch of code in the > tests that's doing stuff like storing a pair of `float`s into memory and then > loading a pair of `half`s out, which is obviously incorrect. None of this > has anything to do with our conversation about avoiding fptrunc/fpext pairs > over return boundaries. Thanks @rjmccall for the explanation. I understand it a bit more now. So my proposal is still valid, though it doesn't solve the problem you described. I don't know any details of the front-end, but I can see the problem in the test case. Assumably we don't do fpext then fptrunc for unary and binary expression, the problem still exists in, e.g., _Float16 foo(_Float16 a, _Float16 b, _Float16 c) { return a + b + c; } Can we always split it into two expressions, e.g., _Float16 t = a + b + c; return t; I think they are constant equivalent in C and C++ semantics. And it may solve the returning problem you descributes. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113107/new/ https://reviews.llvm.org/D113107 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits