Andy Polyakov wrote:
Latest relevant update is more picky and requires yet another assembler
module compiled and linked in.
You mean x86cpuid.o? Got that.
I'd rather discuss proper and complete support for assembler modules in
DJGPP than trying to figure out what went wrong with your
From snapshot in May with SSE2 enabled:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512          3171.75k    12757.93k    22761.88k    34514.56k    40059.42k
But now it's back to non-SSE2 speed:
type             16 bytes     64 bytes    256 bytes   1024
As far as I can see, stock OpenSSL doesn't generate assembler modules
for DJGPP, so you've got to give more details about how you generate
assembler modules. Note that picmeup is used in des assembler modules.
Can you figure out how it works there? A.
I tweaked the djgpp makefiles to
There are some more problems with OPENSSL_ia32cap.
In crypto\bn\asm\bn.586.pl it says:
if ($sse2) {
picmeup(eax,OPENSSL_ia32cap);
but on my target (gcc/djgpp) that should actually be
_OPENSSL_ia32cap. I tried prefixing it with $under, but that didn't
work since I know next to nothing
Andy Polyakov [EMAIL PROTECTED] said:
As far as I can see, stock OpenSSL doesn't generate assembler modules
for DJGPP, so you've got to give more details about how you generate
assembler modules. Note that picmeup is used in des assembler modules.
Can you figure out how it works there?
openssl speed sha-512:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
With SSE2 disabled:
sha-512          1050.62k     4223.53k     6141.97k     8488.01k     9480.48k
With SSE2 enabled:
sha-512          3171.75k    12757.93k    22761.88k    34514.56k    40059.42k
Andy Polyakov [EMAIL PROTECTED] said:
400% on large blocks.
4x? What gcc version? 3x mentioned in commentary section is also for
largest block and with gcc 2.95.3. Well, not that 4x is worse result...
I used gcc 3.3.1 with -O2 -fno-strength-reduce -fomit-frame-pointer.
As for
I got that part. But AFAICS, when strtol(env,NULL,0) is used to set
OPENSSL_ia32cap and env = 0x0400, strtol() treats the value
as octal. From man strtol:
The string may begin with an arbitrary amount of white space (as
determined by isspace(3)) followed by a single optional + or -
400% on large blocks.
4x? What gcc version? 3x mentioned in commentary section is also for
largest block and with gcc 2.95.3. Well, not that 4x is worse result...
I used gcc 3.3.1 with -O2 -fno-strength-reduce -fomit-frame-pointer.
Oh! I also get worse performance with 3.3.2, ~13 vs. 17MBps on
With SSE2 disabled:
openssl speed sha-512:
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512          1050.62k     4223.53k     6141.97k     8488.01k     9480.48k
With SSE2 enabled:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192