> Here is analysis by Paolo Bonzini:
> 
> I compared crypto/x86_64cpuid.pl and crypto/x86cpuid.pl, and the code in the
> latter is wrong.
> 
> From x86_64cpuid.pl:
> 
>         mov     %edx,%r10d              # %r9d:%r10d is copy of %ecx:%edx
>         bt      \$27,%r9d               # check OSXSAVE bit
>         jnc     .Lclear_avx
>         xor     %ecx,%ecx               # XCR0
>         .byte   0x0f,0x01,0xd0          # xgetbv
>         and     \$6,%eax                # isolate XMM and YMM state support
>         cmp     \$6,%eax
>         je      .Ldone
> .Lclear_avx:
>         mov     \$0xefffe7ff,%eax       # ~(1<<28|1<<12|1<<11)
>         and     %eax,%r9d               # clear AVX, FMA and AMD XOP bits
> .Ldone:
> 
> 
> From x86cpuid.pl:
> 
>         &bt     ("ecx",26);             # check XSAVE bit
>         &jnc    (&label("done"));
>         &bt     ("ecx",27);             # check OSXSAVE bit
>         &jnc    (&label("clear_xmm"));
>         &xor    ("ecx","ecx");
>         &data_byte(0x0f,0x01,0xd0);     # xgetbv
>         &and    ("eax",6);
>         &cmp    ("eax",6);
>         &je     (&label("done"));
>         &cmp    ("eax",2);
>         &je     (&label("clear_avx"));
> &set_label("clear_xmm");
>         &and    ("ebp",0xfdfffffd);     # clear AESNI and PCLMULQDQ bits
>         &and    ("esi",0xfeffffff);     # clear FXSR
> &set_label("clear_avx");
>         &and    ("ebp",0xefffe7ff);     # clear AVX, FMA and AMD XOP bits
> &set_label("done");
> 
> 
> x86_64cpuid.pl is not completely correct; if bit 1 of EAX was zero (XMM
> support not enabled in the OS) you would need to clear AESNI and PCLMULQDQ
> bits as done in x86cpuid.pl.

The rationale for not paying attention to bit 1 of EAX (and for not
manipulating the AESNI and PCLMULQDQ bits) is that all OSes supported by
the assembler pack in question *require* XMM support to be enabled per
the ABI specification.

> However, in practice it does not matter because any OS new enough
> to set OSXSAVE will always enable XMM support as well.

As just implied, it has more to do with the ABI than with how new a
particular kernel is. If there were an x86_64 OS that declared XMM
support optional, it would be a different story.
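
For illustration, the quoted x86_64cpuid.pl fragment can be modelled as a
pure function on the capability word. This is only a sketch: the helper
name mask_x86_64 and the idea of passing XCR0 in as an argument are my
assumptions, while the bit positions and the mask are taken from the
comments in the quoted code.

```python
# Bit positions as named in the comments of the quoted fragment.
OSXSAVE = 1 << 27
AVX = 1 << 28
FMA = 1 << 12
XOP = 1 << 11  # per the quoted comment, bit 11 stands for AMD XOP here

# ~(1<<28|1<<12|1<<11), truncated to 32 bits; equals 0xefffe7ff
AVX_MASK = ~(AVX | FMA | XOP) & 0xFFFFFFFF

XCR0_XMM = 1 << 1  # OS has enabled XMM state
XCR0_YMM = 1 << 2  # OS has enabled YMM state


def mask_x86_64(ecx, xcr0):
    """Hypothetical model of the quoted x86_64cpuid.pl fragment.

    ecx  -- copy of CPUID.1:ECX (%r9d in the quoted code)
    xcr0 -- low 32 bits that xgetbv would return for XCR0
    """
    wanted = XCR0_XMM | XCR0_YMM
    if ecx & OSXSAVE and (xcr0 & wanted) == wanted:
        return ecx  # .Ldone: AVX family stays advertised
    # .Lclear_avx: only AVX, FMA and XOP are cleared; AESNI and
    # PCLMULQDQ are left alone, relying on the ABI guarantee above.
    return ecx & AVX_MASK
```

Note that with OSXSAVE clear the AESNI bit (bit 25) passes through
untouched, which is exactly the behaviour being defended here.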

> x86cpuid.pl instead is completely broken:
> 
> - the whole test is bypassed if XSAVE=1, which makes absolutely no sense. 
> x86_64cpuid.pl is right in testing OSXSAVE

No, the test is bypassed if XSAVE is 0, not 1. XSAVE being 0 also
implies that the AVX flag [as well as FMA and XOP] is 0, which is why it
jumps to 'done' and not 'clear_avx'.
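
To make the control flow easier to follow, here is a rough model of the
quoted x86cpuid.pl fragment. It is a sketch only: the function name
mask_x86 is made up, and I am assuming that "ebp" and "esi" hold copies
of the CPUID ECX and EDX words respectively. The masks are the ones from
the quoted code.

```python
XSAVE = 1 << 26
OSXSAVE = 1 << 27

AVX_MASK = 0xEFFFE7FF       # clears AVX, FMA and AMD XOP bits
XMM_MASK_ECX = 0xFDFFFFFD   # clears AESNI (bit 25) and PCLMULQDQ (bit 1)
FXSR_MASK_EDX = 0xFEFFFFFF  # clears FXSR (bit 24)


def mask_x86(ecx, edx, xcr0):
    """Hypothetical model of the quoted x86cpuid.pl fragment.

    Returns the (ecx, edx) capability words after masking. xcr0 is
    the value xgetbv would return; it is only consulted when OSXSAVE
    is set, as in the assembly.
    """
    if not ecx & XSAVE:
        # bt 26 / jnc done: no XSAVE means no AVX bits to clear
        return ecx, edx
    if ecx & OSXSAVE:
        state = xcr0 & 6
        if state == 6:
            return ecx, edx             # XMM and YMM enabled: done
        if state == 2:
            return ecx & AVX_MASK, edx  # XMM only: clear_avx
    # clear_xmm, which then falls through into clear_avx
    return ecx & XMM_MASK_ECX & AVX_MASK, edx & FXSR_MASK_EDX
```

For example, mask_x86(AESNI | AVX, FXSR, 0) with XSAVE clear returns both
words unchanged, whereas the same call with XSAVE set and OSXSAVE clear
loses the AESNI and FXSR bits as well.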

> - if OSXSAVE=0, all SSE code is disabled,

Not *all*, only the newest (AESNI, GF(2^m), SSSE3 code path in SHA1).

> which also makes no sense because any
> OS less than 10 years old lets you use SSE even if it does not set OSXSAVE
> (via FXSAVE), and this includes of course RHEL6.

Admittedly this is a correct assertion, i.e. OSes that enable XMM support
the "old way" shouldn't be punished by having the AESNI, PCLMULQDQ and
FXSR bits cleared in the test in question.
http://cvs.openssl.org/chngview?cn=21640 is the way to do it.


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
