I queried a guru here at work and got the following response:
 
It is the responsibility of the OS to indicate to the CPU that it supports SSE2 instructions by setting the OSFXSR bit in CR4. If the user is running a legacy OS that doesn't support SSE2 (meaning that it doesn't use the FXSAVE and FXRSTOR instructions during context switches), then the OSFXSR bit won't be set. When the OSFXSR bit is not set, the SSE2 instructions aren't available; if you try to execute one, you will get an invalid opcode exception, even if you are running on the latest CPUs that support SSE2.
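
A minimal sketch of the rule described above, in C. CR4 can only be read at privilege level 0, so the helpers take a CR4 value as a parameter instead of reading the register. The function names are hypothetical; OSFXSR is bit 9 of CR4.

```c
#include <stdint.h>

/* CR4.OSFXSR is bit 9 of the CR4 control register. */
#define CR4_OSFXSR_BIT 9u

/* Returns 1 if the OSFXSR bit is set in the given CR4 value, else 0. */
static int cr4_osfxsr_enabled(uint32_t cr4)
{
    return (int)((cr4 >> CR4_OSFXSR_BIT) & 1u);
}

/*
 * Models the statement above: an SSE2 instruction raises an invalid
 * opcode exception if the CPU lacks SSE2, or if the OS has not set
 * OSFXSR. cpu_has_sse2 is a caller-supplied flag (nonzero = supported).
 */
static int sse2_would_fault(int cpu_has_sse2, uint32_t cr4)
{
    return !cpu_has_sse2 || !cr4_osfxsr_enabled(cr4);
}
```

For example, sse2_would_fault(1, 0) is 1: the CPU supports SSE2 but the OS never set OSFXSR.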
 
Verdon Walker
(801) 861-2633
[EMAIL PROTECTED]
Novell, Inc., the leading provider of information solutions
http://www.novell.com


>>> [EMAIL PROTECTED] 5/21/2004 11:46:20 AM >>>
Work is under way to adopt IA-32 SSE2 code paths in OpenSSL, currently
bn_mul_add_words and SHA-512. Whether or not a path is taken is decided
at run-time. Formally the decision is to be based on two factors: CPU
capability and kernel support for SSE extensions. As it doesn't appear
feasible to detect the latter in a way we're ready to support on
multiple platforms, we choose to shift this responsibility to the end
user. The user will have the option to either set an environment
variable prior to starting the application or recompile the toolkit
without SSE2 support.
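
A rough sketch of such a run-time decision, in C. It is not the OpenSSL code: the variable name EXAMPLE_NO_SSE2 and all function names are made up for illustration, and it assumes the environment variable is used to switch the SSE2 path off. The CPU check uses __get_cpuid from the <cpuid.h> header shipped with GCC and Clang, and reports "no SSE2" on other compilers or architectures.

```c
#include <stdlib.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define EXAMPLE_HAVE_CPUID 1
#endif

/* Hypothetical variable name, chosen only for this example. */
#define EXAMPLE_NO_SSE2_ENV "EXAMPLE_NO_SSE2"

/* CPU capability: CPUID leaf 1, EDX bit 26 reports SSE2. */
static int cpu_has_sse2(void)
{
#ifdef EXAMPLE_HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                      /* leaf 1 not available */
    return (int)((edx >> 26) & 1u);
#else
    return 0;                          /* not x86, or unknown compiler */
#endif
}

/* Assumption: any non-empty value of the variable disables SSE2. */
static int user_disabled_sse2(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Take the SSE2 path only if the CPU has it and the user did not opt out. */
static int use_sse2_path(void)
{
    return cpu_has_sse2() && !user_disabled_sse2(getenv(EXAMPLE_NO_SSE2_ENV));
}
```

Kernel support is deliberately not probed here; that is the part left to the user.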

However! The above is not the actual subject of this query to the
developer community :-) While re-examining the IA-32 instruction
reference I ran into the following question. What does kernel support
mean exactly? The kernel is expected to set a flag in a privileged
control register to denote its intention to preserve the XMM register
bank contents upon process context switch. And *now* the question
itself. What about SSE2 instructions issued with MM registers as
arguments? MM registers are aliased to the FP bank, which is preserved
upon context switch even by older kernels. Would an SSE2 instruction
cause an invalid opcode exception under a legacy kernel even if issued
with an MM register as argument? The catch is that at least
bn_mul_add_words uses the MM register bank exclusively, in which case a
check for CPU capability alone would suffice. Does anybody know the
answer by heart, or have a system with some legacy OS at her/his
disposal that we could test this on? A.
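
One way such a test might be run on a POSIX system, sketched in C: execute the instruction under a temporary SIGILL handler and see whether it completes. All names are hypothetical. The pmuludq probe is built only with GCC or Clang on x86; the two other probes exist only to exercise the harness itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: jump back into runs_without_sigill(). */
static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if probe() ran to completion, 0 if it raised SIGILL. */
static int runs_without_sigill(void (*probe)(void))
{
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    /* Saving the signal mask (second argument 1) lets siglongjmp restore it. */
    if (sigsetjmp(probe_env, 1) == 0) {
        probe();
        ok = 1;
    }
    sigaction(SIGILL, &old, NULL);     /* put the previous handler back */
    return ok;
}

/* Harness self-checks: one probe that is harmless, one that raises SIGILL. */
static void probe_noop(void) { }
static void probe_raise_sigill(void) { raise(SIGILL); }

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* An SSE2 instruction (pmuludq) with MM register operands only. */
void probe_pmuludq_mm(void)
{
    __asm__ volatile("pxor %mm0, %mm0\n\t"
                     "pmuludq %mm0, %mm0\n\t"
                     "emms");          /* leave the x87/MMX state empty */
}
#endif
```

On a legacy OS, runs_without_sigill(probe_pmuludq_mm) returning 1 on an SSE2-capable CPU would suggest that the MM-register form is usable without kernel support; returning 0 would suggest it is not.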
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
