Latest Intel platform Granite Rapids has introduced a new instruction - AMX-FP16, which performs dot-products of two FP16 tiles and accumulates the results into a packed single precision tile. AMX-FP16 adds FP16 capability and allows a FP16 GPU trained model to run faster without loss of accuracy or added SW overhead.
The bit definition: CPUID.(EAX=7,ECX=1):EAX[bit 21] Add CPUID definition for AMX-FP16. Signed-off-by: Jiaxi Chen <jiaxi.c...@linux.intel.com> --- target/i386/cpu.c | 2 +- target/i386/cpu.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index a61f936eef..cd787b3d97 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -875,7 +875,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = { NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, - NULL, NULL, NULL, NULL, + NULL, "amx-fp16", NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, }, diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 3391b99456..d2e3079dfb 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -902,6 +902,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w, #define CPUID_7_1_EAX_AVX512_BF16 (1U << 5) /* CMPCCXADD Instructions */ #define CPUID_7_1_EAX_CMPCCXADD (1U << 7) +/* Support Tile Computational Operations on FP16 Numbers */ +#define CPUID_7_1_EAX_AMX_FP16 (1U << 21) /* XFD Extend Feature Disabled */ #define CPUID_D_1_EAX_XFD (1U << 4) -- 2.27.0