Re: [PATCH v1 1/3] x86/cpufeatures: Add low performance CRC32C instruction CPU feature

2021-01-14 Thread Tony W Wang-oc
On 11/01/2021 19:03, Borislav Petkov wrote:
> On Mon, Jan 11, 2021 at 06:51:59PM +0800, Tony W Wang-oc wrote:
>> This issue will be enhanced by hardware and patch submit will be pending.
> 
> I have no clue what that has to do with your current patch... you might
> need to explain more verbosely.
> 

After internal research, decided to fix the low performance crc32c
instruction issue on these Zhaoxin CPUs by microcode. So, do not need
this patch anymore.

Sincerely
Tonyw


Re: [PATCH v1 1/3] x86/cpufeatures: Add low performance CRC32C instruction CPU feature

2021-01-11 Thread hpa
On January 6, 2021 10:37:50 PM PST, Borislav Petkov  wrote:
>On Thu, Jan 07, 2021 at 02:19:06PM +0800, Tony W Wang-oc wrote:
>> SSE4.2 on Zhaoxin CPUs are compatible with Intel. The presence of
>> CRC32C instruction is enumerated by CPUID.01H:ECX.SSE4_2[bit 20] = 1.
>> Some Zhaoxin CPUs declare support SSE4.2 instruction sets but their
>> CRC32C instruction are working with low performance.
>> 
>> Add a synthetic CPU flag to indicates that the CRC32C instruction is
>> not working as intended. This low performance CRC32C instruction flag
>> is depend on X86_FEATURE_XMM4_2.
>> 
>> Signed-off-by: Tony W Wang-oc 
>> ---
>>  arch/x86/include/asm/cpufeatures.h | 1 +
>>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>>  2 files changed, 2 insertions(+)
>> 
>> diff --git a/arch/x86/include/asm/cpufeatures.h
>b/arch/x86/include/asm/cpufeatures.h
>> index 84b8878..9e8151b 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpufeatures.h
>> @@ -292,6 +292,7 @@
>>  #define X86_FEATURE_FENCE_SWAPGS_KERNEL (11*32+ 5) /* "" LFENCE in
>kernel entry SWAPGS path */
>>  #define X86_FEATURE_SPLIT_LOCK_DETECT   (11*32+ 6) /* #AC for split
>lock */
>>  #define X86_FEATURE_PER_THREAD_MBA  (11*32+ 7) /* "" Per-thread
>Memory Bandwidth Allocation */
>> +#define X86_FEATURE_CRC32C  (11*32+ 8) /* "" Low performance CRC32C
>instruction */
>
>Didn't hpa say to create a BUG flag for it - X86_BUG...? Low
>performance
>insn sounds like a bug and not a feature to me.
>
>And call it X86_BUG_CRC32C_SLOW or ..._UNUSABLE to denote what it
>means.
>
>Thx.

Yes, it should be a BUG due to the inclusion properties of BUG v FEATURE.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH v1 1/3] x86/cpufeatures: Add low performance CRC32C instruction CPU feature

2021-01-11 Thread Borislav Petkov
On Mon, Jan 11, 2021 at 06:51:59PM +0800, Tony W Wang-oc wrote:
> This issue will be enhanced by hardware and patch submit will be pending.

I have no clue what that has to do with your current patch... you might
need to explain more verbosely.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v1 1/3] x86/cpufeatures: Add low performance CRC32C instruction CPU feature

2021-01-11 Thread Tony W Wang-oc


On 07/01/2021 14:37, Borislav Petkov wrote:
> On Thu, Jan 07, 2021 at 02:19:06PM +0800, Tony W Wang-oc wrote:
>> SSE4.2 on Zhaoxin CPUs are compatible with Intel. The presence of
>> CRC32C instruction is enumerated by CPUID.01H:ECX.SSE4_2[bit 20] = 1.
>> Some Zhaoxin CPUs declare support SSE4.2 instruction sets but their
>> CRC32C instruction are working with low performance.
>>
>> Add a synthetic CPU flag to indicates that the CRC32C instruction is
>> not working as intended. This low performance CRC32C instruction flag
>> is depend on X86_FEATURE_XMM4_2.
>>
>> Signed-off-by: Tony W Wang-oc 
>> ---
>>  arch/x86/include/asm/cpufeatures.h | 1 +
>>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>>  2 files changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/cpufeatures.h 
>> b/arch/x86/include/asm/cpufeatures.h
>> index 84b8878..9e8151b 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpufeatures.h
>> @@ -292,6 +292,7 @@
>>  #define X86_FEATURE_FENCE_SWAPGS_KERNEL (11*32+ 5) /* "" LFENCE in 
>> kernel entry SWAPGS path */
>>  #define X86_FEATURE_SPLIT_LOCK_DETECT   (11*32+ 6) /* #AC for split 
>> lock */
>>  #define X86_FEATURE_PER_THREAD_MBA  (11*32+ 7) /* "" Per-thread Memory 
>> Bandwidth Allocation */
>> +#define X86_FEATURE_CRC32C  (11*32+ 8) /* "" Low performance CRC32C 
>> instruction */
> 
> Didn't hpa say to create a BUG flag for it - X86_BUG...? Low performance
> insn sounds like a bug and not a feature to me.
> 
> And call it X86_BUG_CRC32C_SLOW or ..._UNUSABLE to denote what it means.
> 

This issue will be enhanced by hardware and patch submit will be pending.

Sincerely
Tonyw



Re: [PATCH v1 1/3] x86/cpufeatures: Add low performance CRC32C instruction CPU feature

2021-01-06 Thread Borislav Petkov
On Thu, Jan 07, 2021 at 02:19:06PM +0800, Tony W Wang-oc wrote:
> SSE4.2 on Zhaoxin CPUs are compatible with Intel. The presence of
> CRC32C instruction is enumerated by CPUID.01H:ECX.SSE4_2[bit 20] = 1.
> Some Zhaoxin CPUs declare support SSE4.2 instruction sets but their
> CRC32C instruction are working with low performance.
> 
> Add a synthetic CPU flag to indicates that the CRC32C instruction is
> not working as intended. This low performance CRC32C instruction flag
> is depend on X86_FEATURE_XMM4_2.
> 
> Signed-off-by: Tony W Wang-oc 
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h 
> b/arch/x86/include/asm/cpufeatures.h
> index 84b8878..9e8151b 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -292,6 +292,7 @@
>  #define X86_FEATURE_FENCE_SWAPGS_KERNEL  (11*32+ 5) /* "" LFENCE in 
> kernel entry SWAPGS path */
>  #define X86_FEATURE_SPLIT_LOCK_DETECT(11*32+ 6) /* #AC for split 
> lock */
>  #define X86_FEATURE_PER_THREAD_MBA   (11*32+ 7) /* "" Per-thread Memory 
> Bandwidth Allocation */
> +#define X86_FEATURE_CRC32C   (11*32+ 8) /* "" Low performance CRC32C 
> instruction */

Didn't hpa say to create a BUG flag for it - X86_BUG...? Low performance
insn sounds like a bug and not a feature to me.

And call it X86_BUG_CRC32C_SLOW or ..._UNUSABLE to denote what it means.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH v1 1/3] x86/cpufeatures: Add low performance CRC32C instruction CPU feature

2021-01-06 Thread Tony W Wang-oc
SSE4.2 on Zhaoxin CPUs are compatible with Intel. The presence of
CRC32C instruction is enumerated by CPUID.01H:ECX.SSE4_2[bit 20] = 1.
Some Zhaoxin CPUs declare support SSE4.2 instruction sets but their
CRC32C instruction are working with low performance.

Add a synthetic CPU flag to indicates that the CRC32C instruction is
not working as intended. This low performance CRC32C instruction flag
is depend on X86_FEATURE_XMM4_2.

Signed-off-by: Tony W Wang-oc 
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 84b8878..9e8151b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -292,6 +292,7 @@
 #define X86_FEATURE_FENCE_SWAPGS_KERNEL(11*32+ 5) /* "" LFENCE in 
kernel entry SWAPGS path */
 #define X86_FEATURE_SPLIT_LOCK_DETECT  (11*32+ 6) /* #AC for split lock */
 #define X86_FEATURE_PER_THREAD_MBA (11*32+ 7) /* "" Per-thread Memory 
Bandwidth Allocation */
+#define X86_FEATURE_CRC32C (11*32+ 8) /* "" Low performance CRC32C 
instruction */
 
 /* Intel-defined CPU features, CPUID level 0x0007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX512_BF16(12*32+ 5) /* AVX512 BFLOAT16 
instructions */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 42af31b6..7d7fca7 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -72,6 +72,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_AVX512_FP16,  X86_FEATURE_AVX512BW  },
{ X86_FEATURE_ENQCMD,   X86_FEATURE_XSAVES},
{ X86_FEATURE_PER_THREAD_MBA,   X86_FEATURE_MBA   },
+   { X86_FEATURE_CRC32C,   X86_FEATURE_XMM4_2},
{}
 };
 
-- 
2.7.4