Joerg Schilling wrote:
> 64 bit code on SPARC is typically 5-10% slower, AMD64 code is typically 30%
> faster because there are twice as many registers.
30% is very optimistic. My test results vary between 30% slower and 200%
faster depending on the application and compiler. On average I'd say AMD64
code will be ~10% faster.
My previously posted results with "openssl speed" are void: the 32 bit code was
compiled with -xO3 while the 64 bit code was compiled with -xO5. I reran the
tests, which on average still favour 64 bit code, but to a lesser extent.
Test environment:
cc: Sun C 5.8 Patch 121016-03 2006/06/07
ube: Sun Compiler Common 11 Patch 120759-08 2006/08/08
../gcc-4.1.1/configure --with-system-zlib --with-gnu-as
--with-as=/usr/sfw/bin/gas --without-included-gettext
--without-libiconv-prefix --enable-languages=c,c++,ada,fortran,objc --with-x
--enable-java-awt=xlib
Thread model: posix
gcc version 4.1.1
AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ ( == Opteron 175)
2x1 GB RAM Dual Channel DDR400 CL3 ECC
Numbers below are the relative performance of AMD64 vs. IA32 (<0%: IA32 faster,
>0%: AMD64 faster).
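
For readers who want to reproduce such percentages from their own "openssl speed" runs, here is a minimal sketch of the sign convention. It assumes the relative performance is the ratio of 64 bit to 32 bit throughput minus one; the function name and the throughput values are hypothetical and are not taken from my measurements.

```python
def relative_performance(throughput_32: float, throughput_64: float) -> float:
    """Percentage by which the 64 bit build is faster (>0) or slower (<0).

    Assumption: both arguments are throughputs (e.g. bytes/s or
    operations/s), so a larger value means a faster build.
    """
    return (throughput_64 / throughput_32 - 1.0) * 100.0


# Hypothetical throughputs, not measured values.
print(f"{relative_performance(100_000.0, 180_000.0):+.2f}%")  # AMD64 faster
print(f"{relative_performance(100_000.0, 90_000.0):+.2f}%")   # IA32 faster
```

With this convention a value of 100% means the 64 bit build does twice the work per second.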
(1) OpenSSL 0.9.8d
Studio 11 32 vs. 64 bits
./Configure no-asm solaris-x86-cc
./Configure no-asm solaris64-x86_64-cc
cc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -fast
-xstrconst [ -xarch=amd64 -Xa -DL_ENDIAN ]
type                 16B       64B      256B     1024B     8192B
md2              -10.05%   -11.15%   -11.74%   -12.12%   -12.34%
md4                8.86%     6.05%     0.52%    -5.88%   -10.08%
md5               16.38%    10.94%     0.72%    -8.88%   -14.06%
hmac(md5)         -2.63%    -5.14%    -8.79%   -12.78%   -14.68%
sha1               4.24%   -11.21%   -21.91%   -26.25%   -28.58%
rmd160            -1.22%   -10.99%   -20.55%   -26.60%   -29.35%
rc4               78.52%    82.69%    80.98%    81.75%    81.79%
des cbc           -8.77%    -9.57%    -9.63%    -9.69%    -9.63%
idea cbc           6.43%     6.04%     6.02%     6.10%     5.85%
rc2 cbc           -0.68%    -1.16%    -1.18%    -1.27%    -1.46%
blowfish cbc      -7.59%    -9.09%    -9.35%    -9.42%    -9.98%
cast cbc         -23.04%   -24.26%   -24.59%   -25.31%   -24.85%
aes-128 cbc       60.48%    61.71%    61.91%    62.32%    62.27%
aes-192 cbc       64.41%    63.91%    64.31%    65.11%    65.13%
aes-256 cbc       65.03%    66.60%    67.89%    67.40%    67.45%
sha256           -16.11%   -19.27%   -23.42%   -25.54%   -26.56%
sha512            82.83%    83.21%   112.24%   129.11%   137.42%

                    sign    verify
rsa  512 bits     40.73%    28.55%
rsa 1024 bits     28.89%    17.55%
rsa 2048 bits     15.93%     3.47%
rsa 4096 bits      7.69%    -3.87%
dsa  512 bits     29.38%    30.25%
dsa 1024 bits     20.51%    21.10%
dsa 2048 bits      7.06%     7.65%
gcc 4.1.1 32 vs. 64 bits
./Configure no-asm solaris-x86-gcc
./Configure no-asm solaris64-x86_64-gcc
gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -O3
-fomit-frame-pointer -DL_ENDIAN
{ -march=pentium -DOPENSSL_NO_INLINE_ASM |
-m64 -DL_ENDIAN -DMD32_REG_T=int }
type                 16B       64B      256B     1024B     8192B
md2               -7.40%    -9.46%   -10.21%    -9.36%    -8.95%
md4               26.77%    24.25%    19.80%    14.47%    11.19%
md5               18.20%    16.72%    11.50%     6.06%     2.69%
hmac(md5)         19.03%    16.02%    10.69%     5.95%     2.59%
sha1              16.22%    13.13%    16.53%    20.12%    22.24%
rmd160            24.41%    17.51%    12.67%     8.13%     6.07%
rc4               22.65%    22.98%    23.10%    23.19%    23.17%
des cbc           38.35%    37.66%    37.36%    37.29%    37.11%
idea cbc          10.96%     6.71%     3.94%     3.69%     3.33%
rc2 cbc            1.53%     0.27%    -0.23%    -0.22%    -0.33%
blowfish cbc       1.14%    -1.38%    -1.93%    -2.16%    -2.19%
cast cbc          95.12%    97.09%    97.57%    97.94%    98.07%
aes-128 cbc       76.22%    82.13%    83.89%    84.50%    84.79%
aes-192 cbc       84.24%    86.69%    88.12%    88.91%    89.08%
aes-256 cbc       83.59%    90.55%    91.96%    92.34%    92.52%
sha256            -3.48%    -2.75%    -1.07%    -0.29%     0.09%
sha512           177.33%   177.60%   242.40%   279.34%   301.04%

                    sign    verify
rsa  512 bits     94.92%   109.87%
rsa 1024 bits    124.20%   123.21%
rsa 2048 bits    136.36%   130.01%
rsa 4096 bits    142.86%   129.65%
dsa  512 bits    117.52%   114.45%
dsa 1024 bits    137.08%   128.02%
dsa 2048 bits    134.24%   130.59%
(2) gzip/bzip2
I also measured compression/decompression speed with gzip and bzip2 (test
file: gcc-4.1.1.tar):
           Studio 11 32 vs. 64    gcc 4.1.1 32 vs. 64
gzip             -5.78 %                23.69 %
gunzip            2.46 %                 2.26 %
bzip2             3.47 %                 4.71 %
bunzip2          10.38 %                12.12 %
gcc options: -O3 [ -m64 ]
cc options: -fast [ -xarch=amd64 ]
gzip-1.2.4a / bzip2-1.0.3
(3) Oracle 10g Release 2
And Oracle 10g Release 2 (10.2.0.2) 32 bit vs. 64 bit
(time for
@?/rdbms/admin/catalog.sql
@?/rdbms/admin/catproc.sql
on a newly created database. init.ora parameters were the same for
32 bit and 64 bit):
Time for catalog/catproc
32 bit      381s (user)    841s (real)
64 bit      365s (user)    833s (real)
--------------------------------------------
Speedup:    4.38%          N/A
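
The speedup figure is the ratio of the two run times minus one. A small sketch of that arithmetic, using only the timings from the table (the function name is made up for illustration):

```python
def speedup_percent(time_32_s: float, time_64_s: float) -> float:
    """Speedup of the 64 bit run over the 32 bit run, in percent.

    Inputs are elapsed times in seconds, so a smaller time_64_s
    gives a positive speedup.
    """
    return (time_32_s / time_64_s - 1.0) * 100.0


user = speedup_percent(381, 365)  # CPU (user) time from the table
real = speedup_percent(841, 833)  # wall-clock time from the table

print(f"user: {user:.2f}%")  # prints "user: 4.38%"
print(f"real: {real:.2f}%")  # prints "real: 0.96%"
```

The same formula applied to the wall-clock times gives under 1%, which fits with the run not being CPU bound.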
Conclusions:
(1) OpenSSL
40% slowdown up to 300% speedup. gcc64 results require further investigation.
Average is difficult to calculate. Some benchmarks are *much* faster in the 64
bit versions (RC4, AES, SHA512, RSA, DSA); others are slower or run at nearly
equal speed.
The 64 bit gcc results for OpenSSL are remarkable. Let's compare them to the
64 bit Studio 11 results:
gcc 4.1.1 -m64 vs Studio-11 -xarch=amd64
type                 16B       64B      256B     1024B     8192B
md2                2.31%     2.38%     2.17%     1.90%     1.95%
md4               26.94%    31.90%    42.53%    57.73%    69.77%
md5               15.15%    18.31%    26.14%    34.08%    38.99%
hmac(md5)         29.79%    29.06%    33.26%    37.32%    39.36%
sha1              14.51%    20.64%    31.24%    33.62%    35.94%
rmd160            30.72%    39.88%    54.02%    63.29%    68.00%
rc4              -14.06%   -15.15%   -15.19%   -15.35%   -15.40%
des cbc            4.39%     5.26%     5.42%     5.53%     5.41%
idea cbc          -5.23%    -9.13%   -10.22%   -10.49%   -10.47%
rc2 cbc           -5.78%    -5.70%    -5.95%    -6.00%    -5.91%
blowfish cbc      17.46%    18.34%    18.54%    18.63%    19.03%
cast cbc          42.77%    43.79%    43.92%    45.47%    44.42%
aes-128 cbc       13.49%    16.56%    17.83%    17.73%    18.04%
aes-192 cbc       15.29%    18.08%    18.68%    18.78%    19.05%
aes-256 cbc       15.41%    18.71%    19.12%    19.48%    19.71%
sha256            29.78%    34.61%    39.95%    43.14%    44.50%
sha512            29.04%    29.04%    33.51%    35.32%    37.11%

                    sign    verify
rsa  512 bits     46.14%    53.01%
rsa 1024 bits     70.39%    67.51%
rsa 2048 bits     86.89%    89.64%
rsa 4096 bits     94.29%    95.69%
dsa  512 bits     66.00%    66.58%
dsa 1024 bits     81.30%    80.66%
dsa 2048 bits     91.30%    90.99%
Wow! I am shocked by the bad results of Studio 11 compared to gcc.
(2) gzip/bzip2
Speedup between -5% and 25%
Average speedup
for Studio 11: 2%
for gcc 4.1.1: 9%
(3) Oracle
~5% speedup in CPU time
Studio 11 vs. gcc 4.1.1: No clear winner. Perhaps gcc is generating better 64
bit code.
Code size: AMD64 code is ~20-30% larger with Studio 11 and ~10% larger with
gcc. Studio 11 code is ~20% larger than gcc code:
$ size gzip-*
gzip-32.cc: 67210 + 5979 + 330505 = 403694
gzip-32.gcc: 58347 + 3036 + 330912 = 392295
gzip-64.cc: 80982 + 8275 + 332353 = 421610
gzip-64.gcc: 65463 + 5096 + 332668 = 403227
$ size bzip2-*
bzip2-32.cc: 105248 + 4366 + 5839 = 115453
bzip2-32.gcc: 79768 + 3588 + 5981 = 89337
bzip2-64.cc: 120572 + 4922 + 7455 = 132949
bzip2-64.gcc: 85096 + 3976 + 7625 = 96697
$ size openssl-*
openssl-32.cc: 1721518 + 75476 + 16332 = 1813326
openssl-32.gcc: 1523704 + 73864 + 17976 = 1615544
openssl-64.cc: 2257218 + 126616 + 18912 = 2402746
openssl-64.gcc: 1814240 + 124056 + 21192 = 1959488
$ size 10.2.0*/bin/oracle
10.2.0_64/bin/oracle: 94541351 + 2484717 + 34179 = 97060247
10.2.0_32/bin/oracle: 71377376 + 301249 + 27895 = 71706520
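
The growth percentages can be recomputed from the size(1) output above. The following is only a sketch: it assumes the first number on each line is the text (code) segment, and the helper names are made up for illustration.

```python
# size(1) lines copied from above; the Oracle binaries are left out because
# their file names follow a different pattern.
SIZE_OUTPUT = """\
gzip-32.cc: 67210 + 5979 + 330505 = 403694
gzip-32.gcc: 58347 + 3036 + 330912 = 392295
gzip-64.cc: 80982 + 8275 + 332353 = 421610
gzip-64.gcc: 65463 + 5096 + 332668 = 403227
bzip2-32.cc: 105248 + 4366 + 5839 = 115453
bzip2-32.gcc: 79768 + 3588 + 5981 = 89337
bzip2-64.cc: 120572 + 4922 + 7455 = 132949
bzip2-64.gcc: 85096 + 3976 + 7625 = 96697
openssl-32.cc: 1721518 + 75476 + 16332 = 1813326
openssl-32.gcc: 1523704 + 73864 + 17976 = 1615544
openssl-64.cc: 2257218 + 126616 + 18912 = 2402746
openssl-64.gcc: 1814240 + 124056 + 21192 = 1959488
"""


def first_column_sizes(size_output):
    """Map (program, compiler, bits) to the first size column.

    Assumption: the first column is the text segment.
    """
    sizes = {}
    for line in size_output.splitlines():
        name, numbers = line.split(":")
        stem, compiler = name.rsplit(".", 1)   # "gzip-32", "cc"
        program, bits = stem.rsplit("-", 1)    # "gzip", "32"
        sizes[(program, compiler, int(bits))] = int(numbers.split("+")[0])
    return sizes


def growth_64_vs_32(sizes):
    """Percentage growth of the 64 bit build over the 32 bit build."""
    return {
        (program, compiler):
            (sizes[(program, compiler, 64)] / sizes[(program, compiler, 32)] - 1.0) * 100.0
        for (program, compiler, bits) in sizes
        if bits == 32
    }


growth = growth_64_vs_32(first_column_sizes(SIZE_OUTPUT))
for (program, compiler), pct in sorted(growth.items()):
    print(f"{program:8s} {compiler:4s} {pct:+.1f}%")
```

Comparing the cc and gcc entries for the same program and word size in the same way gives the Studio 11 vs. gcc difference.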
Daniel
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org