zahiraam added inline comments.
================ Comment at: clang/lib/CodeGen/CGExprScalar.cpp:1315 + if ((SrcType->isHalfType() || iSFloat16Allowed) && + !CGF.getContext().getLangOpts().NativeHalfType) { // Cast to FP using the intrinsic if the half type itself isn't supported. ---------------- pengfei wrote: > rjmccall wrote: > > pengfei wrote: > > > rjmccall wrote: > > > > pengfei wrote: > > > > > rjmccall wrote: > > > > > > Okay, this condition is pretty ridiculous to be repeating in three > > > > > > different places across the compiler. Especially since you're > > > > > > going to change it when you implement the new option, right? > > > > > > > > > > > > Can we state this condition more generally? I'm not sure why this > > > > > > is so narrowly restricted, and the variable name isn't telling me > > > > > > anything, since `_Float16` must by definition be "allowed" if we > > > > > > have an expression of `_Float16` type. > > > > > > since _Float16 must by definition be "allowed" if we have an > > > > > > expression of _Float16 type. > > > > > > > > > > _Float16 is allowed only on a few targets. > > > > > https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point > > > > > By the way, we should update for X86 since it's not limited to > > > > > avx512fp16 now. > > > > > _Float16 is allowed only on a few targets. > > > > > > > > Yes, I know that. But if `SrcType->isFloat16Type()` is true, we must > > > > be on one of those targets, because the type doesn't otherwise exist. > > > I see your point now. The problem here is we want to allow the `_Float16` > > > to be used more broadly. But the target doesn't really support it > > > sometime. Currently full arithmatic operations are supported only on > > > target with AVX512FP16. > > > We should cast for those targets without AVX512FP16 while avoid to do on > > > AVX512FP16. > > I agree that many targets don't natively support arithmetic on this format, > > but x86 is not the first one that does. Unless I'm misunderstanding, we > > already track this property in Clang's TargetInfo as `hasLegalHalfType()`. > > `+avx512fp16` presumably ought to set this. > > > > I'm not sure what the interaction with the `NativeHalfType` LangOpt is > > supposed to be here. My understanding is that that option is just supposed > > to affect `__fp16`, basically turning it into a proper arithmetic type, > > i.e. essentially `_Float16`. Whatever effect you want to apply to > > `_Float16` should presumably happen even if that option not set. > > > > More broadly, I don't think your approach in this patch is correct. The > > type of operations on `_Float16` should not change based on whether the > > target natively supports `_Float16`. If we need to emulate those > > operations on targets that don't provide them natively, we should do that > > at a lower level than the type system. > > > > The most appropriate place to do that is going to depend on the exact > > semantics we want. > > > > If we want to preserve `half` semantics exactly regardless of target, we > > should have Clang's IR generation actually emit `half` operations. Targets > > that don't support those operations natively will have to lower at least > > some of those operations into compiler-rt calls, but that's not at all > > unprecedented. > > > > If we're okay with playing loose for performance reasons, we can promote to > > `float` immediately around individual arithmetic operations. IR generation > > is probably the most appropriate place to do that. But I'm quite concerned > > about that making `_Float16` feel like an unpredictable/unportable type; it > > seems to me that software emulation is much better. > > > > If you're proposing the latter, I think you need to raise that more widely > > than a code review; please make a post on llvm-dev. > > we already track this property in Clang's TargetInfo as `hasLegalHalfType()` > > That sounds a good approch. Thank you. > > > The type of operations on `_Float16` should not change based on whether the > > target natively supports `_Float16`. If we need to emulate those operations > > on targets that don't provide them natively, we should do that at a lower > > level than the type system. > > Unfortunately, we can't do it at low level. The reason is (I'm not expert in > frontend, just recalled from last disscussion with GCC folks) we have to do > expresssion emulation to respect C/C++ semantics. GCC has option > `-fexcess-precision=16` to match the same result with native instructions, > but the default is `-fexcess-precision=fast` according to language semantics. > > > The most appropriate place to do that is going to depend on the exact > > semantics we want... > > Note, we are not simply doing emulation in the frontend. It's backend's > responsibility to emulate a single `half` operation. But it's frontend's > responsibility to choose whether to emit several `half` operations or emit > promote + several `float` operations + truncate. As described in the title, > this patch is doing for the latter. > Okay, this condition is pretty ridiculous to be repeating in three different > places across the compiler. Especially since you're going to change it when > you implement the new option, right? > > Can we state this condition more generally? I'm not sure why this is so > narrowly restricted, and the variable name isn't telling me anything, since > `_Float16` must by definition be "allowed" if we have an expression of > `_Float16` type. I agree :) I wasn't aware of this flag. I am still in the process of figuring out how float16 work and what is exactly required for it. Yes may be having 2 separate patches is the right way to go. One that turns Float16 on for x86 without the use of +avx512fp16 feature and another one that does the emulation ( half to float -> float arithmetic-> truncation to half). @pengfei ? ================ Comment at: clang/lib/CodeGen/CGExprScalar.cpp:1315 + if ((SrcType->isHalfType() || iSFloat16Allowed) && + !CGF.getContext().getLangOpts().NativeHalfType) { // Cast to FP using the intrinsic if the half type itself isn't supported. ---------------- zahiraam wrote: > pengfei wrote: > > rjmccall wrote: > > > pengfei wrote: > > > > rjmccall wrote: > > > > > pengfei wrote: > > > > > > rjmccall wrote: > > > > > > > Okay, this condition is pretty ridiculous to be repeating in > > > > > > > three different places across the compiler. Especially since > > > > > > > you're going to change it when you implement the new option, > > > > > > > right? > > > > > > > > > > > > > > Can we state this condition more generally? I'm not sure why > > > > > > > this is so narrowly restricted, and the variable name isn't > > > > > > > telling me anything, since `_Float16` must by definition be > > > > > > > "allowed" if we have an expression of `_Float16` type. > > > > > > > since _Float16 must by definition be "allowed" if we have an > > > > > > > expression of _Float16 type. > > > > > > > > > > > > _Float16 is allowed only on a few targets. > > > > > > https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point > > > > > > By the way, we should update for X86 since it's not limited to > > > > > > avx512fp16 now. > > > > > > _Float16 is allowed only on a few targets. > > > > > > > > > > Yes, I know that. But if `SrcType->isFloat16Type()` is true, we must > > > > > be on one of those targets, because the type doesn't otherwise exist. > > > > I see your point now. The problem here is we want to allow the > > > > `_Float16` to be used more broadly. But the target doesn't really > > > > support it sometime. Currently full arithmatic operations are supported > > > > only on target with AVX512FP16. > > > > We should cast for those targets without AVX512FP16 while avoid to do > > > > on AVX512FP16. > > > I agree that many targets don't natively support arithmetic on this > > > format, but x86 is not the first one that does. Unless I'm > > > misunderstanding, we already track this property in Clang's TargetInfo as > > > `hasLegalHalfType()`. `+avx512fp16` presumably ought to set this. > > > > > > I'm not sure what the interaction with the `NativeHalfType` LangOpt is > > > supposed to be here. My understanding is that that option is just > > > supposed to affect `__fp16`, basically turning it into a proper > > > arithmetic type, i.e. essentially `_Float16`. Whatever effect you want > > > to apply to `_Float16` should presumably happen even if that option not > > > set. > > > > > > More broadly, I don't think your approach in this patch is correct. The > > > type of operations on `_Float16` should not change based on whether the > > > target natively supports `_Float16`. If we need to emulate those > > > operations on targets that don't provide them natively, we should do that > > > at a lower level than the type system. > > > > > > The most appropriate place to do that is going to depend on the exact > > > semantics we want. > > > > > > If we want to preserve `half` semantics exactly regardless of target, we > > > should have Clang's IR generation actually emit `half` operations. > > > Targets that don't support those operations natively will have to lower > > > at least some of those operations into compiler-rt calls, but that's not > > > at all unprecedented. > > > > > > If we're okay with playing loose for performance reasons, we can promote > > > to `float` immediately around individual arithmetic operations. IR > > > generation is probably the most appropriate place to do that. But I'm > > > quite concerned about that making `_Float16` feel like an > > > unpredictable/unportable type; it seems to me that software emulation is > > > much better. > > > > > > If you're proposing the latter, I think you need to raise that more > > > widely than a code review; please make a post on llvm-dev. > > > we already track this property in Clang's TargetInfo as > > > `hasLegalHalfType()` > > > > That sounds a good approch. Thank you. > > > > > The type of operations on `_Float16` should not change based on whether > > > the target natively supports `_Float16`. If we need to emulate those > > > operations on targets that don't provide them natively, we should do that > > > at a lower level than the type system. > > > > Unfortunately, we can't do it at low level. The reason is (I'm not expert > > in frontend, just recalled from last disscussion with GCC folks) we have to > > do expresssion emulation to respect C/C++ semantics. GCC has option > > `-fexcess-precision=16` to match the same result with native instructions, > > but the default is `-fexcess-precision=fast` according to language > > semantics. > > > > > The most appropriate place to do that is going to depend on the exact > > > semantics we want... > > > > Note, we are not simply doing emulation in the frontend. It's backend's > > responsibility to emulate a single `half` operation. But it's frontend's > > responsibility to choose whether to emit several `half` operations or emit > > promote + several `float` operations + truncate. As described in the title, > > this patch is doing for the latter. > > Okay, this condition is pretty ridiculous to be repeating in three > > different places across the compiler. Especially since you're going to > > change it when you implement the new option, right? > > > > Can we state this condition more generally? I'm not sure why this is so > > narrowly restricted, and the variable name isn't telling me anything, since > > `_Float16` must by definition be "allowed" if we have an expression of > > `_Float16` type. > > I agree :) I wasn't aware of this flag. I am still in the process of > figuring out how float16 work and what is exactly required for it. > Yes may be having 2 separate patches is the right way to go. One that turns > Float16 on for x86 without the use of +avx512fp16 feature and another one > that does the emulation ( half to float -> float arithmetic-> truncation to > half). @pengfei ? > > we already track this property in Clang's TargetInfo as `hasLegalHalfType()` > > That sounds a good approch. Thank you. > > > The type of operations on `_Float16` should not change based on whether the > > target natively supports `_Float16`. If we need to emulate those operations > > on targets that don't provide them natively, we should do that at a lower > > level than the type system. > > Unfortunately, we can't do it at low level. The reason is (I'm not expert in > frontend, just recalled from last disscussion with GCC folks) we have to do > expresssion emulation to respect C/C++ semantics. GCC has option > `-fexcess-precision=16` to match the same result with native instructions, > but the default is `-fexcess-precision=fast` according to language semantics. > > > The most appropriate place to do that is going to depend on the exact > > semantics we want... > > Note, we are not simply doing emulation in the frontend. It's backend's > responsibility to emulate a single `half` operation. But it's frontend's > responsibility to choose whether to emit several `half` operations or emit > promote + several `float` operations + truncate. As described in the title, > this patch is doing for the latter. The _Float16 type is supported for both C and C++, on x86 systems with SSE2 enabled. From https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html: "On x86 targets with SSE2 enabled, **without -mavx512fp16, all operations will be emulated by software emulation** and the float instructions. The default behavior for FLT_EVAL_METHOD is to keep the intermediate result of the operation as 32-bit precision. This may lead to inconsistent behavior between software emulation and AVX512-FP16 instructions. Using -fexcess-precision=16 will force round back after each operation. Using -mavx512fp16 will generate AVX512-FP16 instructions instead of software emulation. The default behavior of FLT_EVAL_METHOD is to round after each operation. The same is true with -fexcess-precision=standard and -mfpmath=sse. If there is no -mfpmath=sse, -fexcess-precision=standard alone does the same thing as before, It is useful for code that does not have _Float16 and runs on the x87 FPU." This is what we are trying to reach out with this patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113107/new/ https://reviews.llvm.org/D113107 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits