"Andy Polyakov" <[EMAIL PROTECTED]> said:

> >>400% on large blocks.
>
> 4x? What gcc version? 3x mentioned in commentary section is also for
> largest block and with gcc 2.95.3. Well, not that 4x is worse result...

I used gcc 3.3.1 with -O2 -fno-strength-reduce -fomit-frame-pointer.

> As for OPENSSL_ia32cap. First of all, it's work in progress, it's not
> final yet. But the current plan for it is following. Even though it will
> be possible to manipulate the variable in question programmatically from
> application, we will *not* recommend it. Instead it will be initialized
> upon call to OPENSSL_add_all_algorithms to the value returned in EDX
> register by CPUID instruction (that's why the value is 1<<26).

I got that part. But AFAICS, when strtol(env,NULL,0) is used to set
OPENSSL_ia32cap and env = "0x04000000", strtol() treats the value as octal.
From man strtol:

  The string may begin with an arbitrary amount of white space (as
  determined by isspace(3)) followed by a single optional + or - sign.
  If base is zero or 16, the string may then include a 0x prefix, and
  the number will be read in base 16; otherwise, a zero base is taken
  as 10 (decimal) unless the *next character is 0*, in which case it
  is taken as 8 (octal).

> starting application [or recompile without SSE2 support]. So that *no*
> application source code modifications will ever be required to engage or
> disengage SSE2 code.

I would suggest some API like "int OPENSSL_enable_sse2(int)". And btw,
shouldn't that be "unsigned long OPENSSL_ia32cap"?

> > On djgpp where I tested this, we are free to use whatever CPU
> > instructions that's supported. Only trouble is getting at the CR4 register.
>
> As long as you run DJGPP application under OS such as XP you won't be
> able to get to CR4, right? But what happens if you run it under real
> MS-DOS? Well, not that we should rush and implement SSE kernel support
> for MS-DOS, I'm simply curious:-)

It depends on the DPMI host. I haven't tried it yet, but only CWSDPMI
running at ring-0 (on a "GenuineIntel") would allow it. Otherwise
doing a "mov eax, cr4" would cause an exception. Not sure what an
"AuthenticAMD" would do.
Under Windows' NTVDM, it's virtualised and always returns 0 :-(

--gv

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
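
On the strtol() question raised in the message: the man page excerpt it quotes says that with base 0 a "0x" prefix selects base 16, and that octal applies only when the string starts with a bare "0". A minimal sketch for checking this is given here. The helper names parse_cap_mask() and has_sse2() and the macro SSE2_BIT are hypothetical and are not part of OpenSSL; only the strtol(env,NULL,0) call and the 1<<26 value come from the thread.

```c
#include <stdlib.h>

/* Bit 26 of the EDX value returned by CPUID, the 1<<26 from the thread. */
#define SSE2_BIT (1UL << 26)

/* Hypothetical helper: parse a capability mask from text the way the
 * thread describes, i.e. strtol() with base 0.
 *   "0x..." or "0X..."  -> hexadecimal
 *   leading "0" only    -> octal
 *   anything else       -> decimal
 * No error checking is done here; this is only a sketch. */
unsigned long parse_cap_mask(const char *text)
{
    return (unsigned long)strtol(text, NULL, 0);
}

/* Hypothetical helper: report whether the SSE2 bit is set in the mask. */
int has_sse2(unsigned long mask)
{
    return (mask & SSE2_BIT) != 0;
}
```

With a conforming strtol(), parse_cap_mask("0x04000000") gives 67108864, which is 1<<26, so has_sse2() reports the bit as set. Dropping the "x" changes the result: parse_cap_mask("04000000") is read as octal and gives 1048576, which is 1<<20, so the SSE2 bit is not set. If a particular C library behaves differently for the "0x" form, that would be worth reporting separately.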
