Re: SSE2 speed revisited

2004-08-05 Thread Gisle Vanem
Andy Polyakov wrote: Latest relevant update is more picky and requires yet another assembler module compiled and linked in. You mean x86cpuid.o? Got that. I'd rather discuss proper and complete support for assembler modules in DJGPP than trying to figure out what went wrong with your

Re: SSE2 speed revisited

2004-08-04 Thread Andy Polyakov
From snapshot in May with SSE2 enabled:
type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512         3171.75k    12757.93k    22761.88k    34514.56k    40059.42k
But now it's back to non-SSE2 speed:
type 16 bytes 64 bytes 256 bytes 1024

SSE2 speed revisited

2004-07-31 Thread Gisle Vanem
From snapshot in May with SSE2 enabled:
type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512         3171.75k    12757.93k    22761.88k    34514.56k    40059.42k
But now it's back to non-SSE2 speed:
type 16 bytes 64 bytes 256 bytes

Re: SSE2 speed

2004-06-03 Thread Andy Polyakov
As far as I can see, stock OpenSSL doesn't generate assembler modules for DJGPP, so you've got to give more details about how you generate assembler modules. Note that picmeup is used in des assembler modules. Can you figure out how it works there? A. I tweaked the djgpp makefiles to

Re: SSE2 speed

2004-06-02 Thread Andy Polyakov
There are some more problems with OPENSSL_ia32cap. In crypto\bn\asm\bn.586.pl it says: if ($sse2) { picmeup(eax,OPENSSL_ia32cap); but on my target (gcc/djgpp) that should actually be _OPENSSL_ia32cap. I tried prefixing it with $under, but that didn't work since I know next to nothing

Re: SSE2 speed

2004-06-02 Thread Gisle Vanem
Andy Polyakov said: As far as I can see, stock OpenSSL doesn't generate assembler modules for DJGPP, so you've got to give more details about how you generate assembler modules. Note that picmeup is used in des assembler modules. Can you figure out how it works there?

Re: SSE2 speed

2004-05-23 Thread Andy Polyakov
openssl speed sha-512:
type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
With SSE2 disabled:
sha-512         1050.62k     4223.53k     6141.97k     8488.01k     9480.48k
With SSE2 enabled:
sha-512         3171.75k    12757.93k    22761.88k    34514.56k    40059.42k

Re: SSE2 speed

2004-05-23 Thread Gisle Vanem
Andy Polyakov said: 400% on large blocks. 4x? What gcc version? 3x mentioned in commentary section is also for largest block and with gcc 2.95.3. Well, not that 4x is worse result... I used gcc 3.3.1 with -O2 -fno-strength-reduce -fomit-frame-pointer. As for

Re: SSE2 speed

2004-05-23 Thread Gisle Vanem
I got that part. But AFAICS, when strtol(env,NULL,0) is used to set OPENSSL_ia32cap and env = 0x0400, strtol() treats the value as octal. From man strtol: The string may begin with an arbitrary amount of white space (as determined by isspace(3)) followed by a single optional + or -

Re: SSE2 speed

2004-05-23 Thread Andy Polyakov
400% on large blocks. 4x? What gcc version? 3x mentioned in commentary section is also for largest block and with gcc 2.95.3. Well, not that 4x is worse result... I used gcc 3.3.1 with -O2 -fno-strength-reduce -fomit-frame-pointer. Oh! I also get worse performance with 3.3.2, ~13 vs. 17MBps on

SSE2 speed

2004-05-22 Thread Gisle Vanem
With SSE2 disabled: openssl speed sha-512: ...
type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512         1050.62k     4223.53k     6141.97k     8488.01k     9480.48k
with SSE2 enabled:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192