On Thu, Jul 29, 2021 at 5:57 AM Joseph Myers <[email protected]> wrote:
>
> On Wed, 21 Jul 2021, liuhongt via Gcc-patches wrote:
>
> > @@ -23254,13 +23337,15 @@ ix86_get_excess_precision (enum excess_precision_type type)
> >          provide would be identical were it not for the unpredictable
> >          cases.  */
> >        if (!TARGET_80387)
> > -        return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> > +        return TARGET_SSE2
> > +               ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
> > +               : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> >        else if (!TARGET_MIX_SSE_I387)
> >          {
> >            if (!(TARGET_SSE && TARGET_SSE_MATH))
> >              return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
> >            else if (TARGET_SSE2)
> > -            return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> > +            return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
> >          }
> >
> >        /* If we are in standards compliant mode, but we know we will
>
> This patch is not changing the default "fast" mode at all; that's
> promoting to float, unconditionally.  But you have a subsequent change
> there in patch 4 to make the promotions in the default "fast" mode depend
> on hardware support for the new instructions; it's unhelpful for the
> documentation not to corresponding exactly to the code changes in the same
> patch.

Yes, will change.

>
> Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2
> (i.e. whenever the type is available), it might make more sense to follow
> AArch64 and use it only when the hardware instructions are available.  In
> any case, it seems peculiar to use a different threshold in the "fast"

We want to keep the software emulation debuggable. When the software
emulation and the hardware instructions give inconsistent results, users
can still debug on a processor without AVX512-FP16 by using software
emulation together with the extra option -fexcess-precision=standard.

Also, TARGET_C_EXCESS_PRECISION is not tied to a particular type. A
testcase that does not use _Float16 and is supposed to run on the x86 FPU
would regress if GCC is built with --with-arch=sapphirerapids, e.g.
gcc.target/i386/excess-precision-*.c. That's why we can't follow AArch64.

> case from the "standard" case.  -fexcess-precision=standard is not "avoid
> excess precision", it's "implement excess precision in the front end".
> Whenever "fast" is implementing excess precision in the front end,
> "standard" should be doing the same thing as "fast".
>
> > +Soft-fp keeps the intermediate result of the operation at 32-bit precision by defaults,
> > +which may lead to inconsistent behavior between soft-fp and avx512fp16 instructions,
> > +using @option{-fexcess-precision=standard} will force round back after every operation.
>
> "soft-fp" is, as the name of some code within GCC, an internal
> implementation detail, which should not be referenced in the user manual.
> What results in intermediate results being in a wider precision is not
> soft-fp; it's promotions inserted by the front end as a result of how the
> above hook is defined (promotions inserted by the optabs/expand code are
> an implementation detail that should always be followed automatically by a
> truncation of the result and so not be user-visible).

Yes, will reword.

>
> As far as I know, the official name of "avx512fp16" is "AVX512-FP16" and
> text in the manual should use the official capitalization, hyphenation
> etc. in such names unless literally referring to command-line options
> inside @option or similar.

Yes, will change.

>
> --
> Joseph S. Myers
> [email protected]

--
BR,
Hongtao
