>>>>> No, the test is bypassed if XSAVE is 0, not 1. XSAVE being 0 also
>>>>> implies that AVX flag [as well as FMA and XOP] is 0, which is why it
>>>>> jumps to 'done' and not 'clear_avx'.
>>>> This assertion is unfortunately not true on RHEL-6 guests on AVX capable
>>>> CPUs in XEN VM.
>> Could you spell it for me? Which flags does guest observe exactly?
>> XSAVE=0 and AVX=1? I.e. XEN cared to mask XSAVE flag, but not AVX? Is
>> bit masking configurable? If not, how come it clears XSAVE, but not AVX
>> (and FMA)? Wouldn't one consider it a bug? I'm not trying to push it
>> away, just understand...
> 
> Because the hypervisor does nothing that forbids the use of AVX per se; 
> it's not working only because XSAVE doesn't.  If XSAVE were implemented
> in Xen, AVX would start working without any need to treat it 
> specially in the CPUID masking code.

If the hypervisor just sets up XCR0 and the guest attempts to use AVX, the
result will be deplorable. Well, in OpenSSL context it would actually
work, but only because it doesn't attempt to use the 128 most significant
bits of YMM registers [so that XMM context switching in guest suffices].
Performance might be inadequate, but it would work... But formally just
setting up XCR0 is not sufficient for a hypervisor, it should either mask
XSAVE (so that the guest doesn't attempt to set up XCR0) or maintain
per-guest XCR0.

>>> The only assertion I found is that XSAVE=0 implies OSXSAVE=0 (and
>>> OSXSAVE=1 implies XSAVE=1).
>> But in order to be able to use AVX, you *have to* arrange OSXSAVE=1 (and
>> of course corresponding bit in XCR0) and prerequisite for this is XSAVE
>> being 1. I.e. there shouldn't be CPUs that have AVX, but not XSAVE.
> 
> But it's not in the spec, so it's wrong to assume it.

The specification says that an AVX instruction will generate a #UD
exception if XCR0 is not set up appropriately.
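
To spell out the order of checks I have in mind, here is a minimal sketch
in C rather than perlasm. The helper name avx_usable is hypothetical and
not part of OpenSSL; the caller is assumed to have fetched CPUID.1:ECX,
and XCR0 via XGETBV only when OSXSAVE is 1. Bit positions are the usual
ones: XSAVE=26, OSXSAVE=27, AVX=28 in ECX, SSE=1 and YMM=2 in XCR0.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.1:ECX feature bits. */
#define CPUID1_ECX_XSAVE   (1u << 26)
#define CPUID1_ECX_OSXSAVE (1u << 27)
#define CPUID1_ECX_AVX     (1u << 28)

/* XCR0 state-component bits. */
#define XCR0_SSE (1u << 1)
#define XCR0_YMM (1u << 2)

/*
 * Hypothetical helper (not OpenSSL code): decide whether AVX may be
 * executed. xcr0 is only consulted when OSXSAVE=1, because XGETBV is
 * not available otherwise.
 */
static bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    /* The OS must have enabled XSAVE/XGETBV first. */
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return false;
    /* Both XMM and YMM state must be enabled in XCR0. */
    if ((xcr0 & (XCR0_SSE | XCR0_YMM)) != (XCR0_SSE | XCR0_YMM))
        return false;
    /* Only now is the AVX feature flag meaningful. */
    return (cpuid1_ecx & CPUID1_ECX_AVX) != 0;
}
```

With flags as in the reported guest (AVX=1 while XSAVE=0 and OSXSAVE=0,
if that is indeed what it observes) this returns false without ever
looking at the AVX bit.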

>>> Also, I believe 13.7 implies that it's wrong to clear SSE feature bits
>>> when XCR0.SSE=0:
>> That's why it's '&jnc (&label("clear_avx"));' now, not "clear_xmm".
> 
> I don't think there is any reason to have clear_xmm,

But you can't deny the *possibility* that there is a 32-bit OS that is
aware of XSAVE and explicitly zeros XCR0[2:1].

> just like in x86_64cpuid.pl.

Once again, the x86_64 calling convention used by the assembler pack in
question requires XMM to be enabled. I.e. you can legitimately deny the
above-mentioned possibility.
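
The difference between the two clearing steps can be sketched as follows.
This is an illustration in C, not the perlasm itself: mask_by_xcr0 and
its is_64bit parameter are hypothetical, and the exact set of bits each
step clears (AVX and FMA for the "clear_avx" step, AES-NI and PCLMULQDQ
for the "clear_xmm" step) is an assumption made for the example.

```c
#include <stdint.h>

/* CPUID.1:ECX feature bits. */
#define CPUID1_ECX_PCLMULQDQ (1u << 1)
#define CPUID1_ECX_FMA       (1u << 12)
#define CPUID1_ECX_AESNI     (1u << 25)
#define CPUID1_ECX_AVX       (1u << 28)

/* XCR0 state-component bits. */
#define XCR0_SSE (1u << 1)
#define XCR0_YMM (1u << 2)

/*
 * Hypothetical helper: return ECX with the features removed that the
 * OS has not enabled in XCR0. Which bits each step clears is an
 * assumption for illustration.
 */
static uint32_t mask_by_xcr0(uint32_t ecx, uint64_t xcr0, int is_64bit)
{
    /* XCR0[2:1] == 11b: XMM and YMM state enabled, keep everything. */
    if ((xcr0 & (XCR0_SSE | XCR0_YMM)) == (XCR0_SSE | XCR0_YMM))
        return ecx;

    /* "clear_avx": YMM state is not usable. */
    ecx &= ~(CPUID1_ECX_AVX | CPUID1_ECX_FMA);

    /*
     * "clear_xmm": XMM state explicitly disabled. Only considered for
     * 32-bit code, since the x86_64 calling convention needs XMM.
     */
    if (!is_64bit && !(xcr0 & XCR0_SSE))
        ecx &= ~(CPUID1_ECX_AESNI | CPUID1_ECX_PCLMULQDQ);

    return ecx;
}
```

So for XCR0[2:1]=01b only the AVX-class bits go away, while a 32-bit OS
that zeros XCR0[2:1] loses the XMM-dependent bits as well.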

>> As for XEN, if it in fact masks XSAVE, but not AVX bits, then even
>> check for XSAVE bit should '&jnc (&label("clear_avx"));' instead of
>> "done". As well as that x86_64cpuid.pl should test for XSAVE...
> 
> That would also work, but it's useless because the spec OTOH says that 
> you *can* ignore XSAVE (and anyway XSAVE means nothing: it says the 
> feature is available, but only OSXSAVE says it is actually usable).

I still fail to see how exactly it failed for you. Once again, which
flags does the guest OS observe exactly? Is the guest OS YMM-capable? Does
the latest x86cpuid.pl work for you or is it still a problem?
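
If it helps with answering, something along these lines would do for
reporting the flags. It is only a sketch: describe_flags is a
hypothetical helper, and it decodes an ECX value from CPUID leaf 1 that
you obtain yourself inside the guest.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the CPUID.1:ECX bits under discussion
 * (FMA=12, XSAVE=26, OSXSAVE=27, AVX=28) into buf.
 * Returns what snprintf returns.
 */
static int describe_flags(uint32_t ecx, char *buf, size_t len)
{
    return snprintf(buf, len, "XSAVE=%u OSXSAVE=%u AVX=%u FMA=%u",
                    (unsigned)((ecx >> 26) & 1u),
                    (unsigned)((ecx >> 27) & 1u),
                    (unsigned)((ecx >> 28) & 1u),
                    (unsigned)((ecx >> 12) & 1u));
}
```

With GCC or Clang, ECX could be obtained through
__get_cpuid(1, &eax, &ebx, &ecx, &edx) from <cpuid.h>.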


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org