My results are interleaved below, translated to your units: basically just
multiplied by 64 and rounded to three significant digits.
                       1.5    1.6    1.7    1.8     my
P III (Coppermine)    1821 / 1850 / 1742 / 1574 / 1614
    1540
P4 (Prescott)         1544 / 1546 / 1541 / 1375 / 1450
    1510
P4 (Northwood)        2200 / 1963 / 1931 / 2483 / 1957
    1920
AMD Sempron           1537 / 1450 / 1394 / 1205 / 1305
    n/a
AMD K10               1270 / 1210 / 1215 /  988 / 1057
    990
Core 2                1170 / 1131 / 1130 /  985 /  984
    1010
i5 Lynnfield          1250 / 1426 / 1271 / 1121 / 1033
    1100
Sandy Bridge          1265 / 1225 / 1228 / 1115 /  981 (*) with shrd
    1010 (folded loop with shrd)
Atom                  2300 / 2050 / 1984 / 1700 / 2455
    1660
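For anyone wanting to redo the unit translation, here is a minimal sketch. It assumes the untranslated figures are in cycles per byte and that a block is 64 bytes; the function names and the sample inputs are hypothetical, not my actual raw measurements.

```python
from math import floor, log10


def round_sig(value, digits=3):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = floor(log10(abs(value)))
    return round(value, digits - 1 - exponent)


def to_cycles_per_block(cycles_per_byte, block_bytes=64):
    """Scale a per-byte figure to a per-block figure (assumed 64-byte block)."""
    return round_sig(cycles_per_byte * block_bytes)


# Hypothetical per-byte inputs, for illustration only.
for cpb in (15.47, 24.1):
    print(cpb, "->", to_cycles_per_block(cpb))
```

Rounding to three significant digits means four-digit results come out as multiples of 10, which is why most of the interleaved numbers end in 0.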
Results are consistent except for P4, Core 2 and Sandy Bridge.
As for P4, it's probably best to just shrug, accept whatever the
result is and forget about it. It's a bit hard to accept, but it's
hardly worth figuring out why our results vary that much.
As for Core 2, the difference is nominal, and if I execute my binary with
a varying stack seed(*) I can also measure 990 cycles per block. In other
words, the variation can be explained by environmental factors such as cache
contention.
As for Sandy Bridge, I don't know... I could observe nominal variations,
2-3%, on my machine, but nothing close to 10%, so this is odd... If you
have the energy, test with a varying stack seed(*)...
(*) Because environment variables reside below the stack, the simplest way
to reseed the stack is 'env A=`perl -e 'print "A"x1024'` ...';
experiment with the number after x.
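If running the env trick by hand gets tedious, the sweep could be scripted along these lines. This is only a sketch: the helper names are hypothetical, the child command is a placeholder that merely reports the padding length, and you would substitute your own benchmark binary. It assumes a Unix-like system where the size of the environment shifts the initial stack position.

```python
import os
import subprocess
import sys


def run_with_env_padding(cmd, pad_len):
    """Run cmd with an extra variable A of pad_len bytes in its environment."""
    env = dict(os.environ)
    env["A"] = "A" * pad_len  # same idea as: env A=`perl -e 'print "A"x1024'` ...
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


def sweep(cmd, pad_lengths):
    """Return {pad_len: stdout} for each padding size tried."""
    return {n: run_with_env_padding(cmd, n).stdout.strip() for n in pad_lengths}


if __name__ == "__main__":
    # Placeholder child: prints the padding length it received.
    # Replace with the benchmark command to be measured.
    child = [sys.executable, "-c", "import os; print(len(os.environ['A']))"]
    for pad_len, output in sweep(child, [64, 1024, 4096]).items():
        print(pad_len, output)
```

Comparing the benchmark output across padding sizes gives a rough idea of how much of the spread is down to stack placement alone.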
So, 1.8 version is quite good. It's the best for almost all old/slow
architectures, and my version is still the best for modern/powerful ones.
Come on, apart from your Sandy Bridge result for 1.8, it's virtually
equivalent. The nominal difference can be explained by environmental
factors, and if not, it's a really low price to pay for a >40% improvement
on Atom. Besides, it's actually the "slow" architectures that need
optimization more :-)
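The percentages can be checked against the table. The sketch below assumes the Atom figure compares your "my" column (2455) with your 1.8 column (1700), and that the Sandy Bridge gap is your 1.8 result (1115) against my 1010; the function name is made up for illustration.

```python
def extra_cycles_pct(slower, faster):
    """Percentage by which `slower` exceeds `faster` (cycles per block)."""
    return (slower / faster - 1.0) * 100.0


atom = extra_cycles_pct(2455, 1700)   # "my" vs 1.8 on Atom
sandy = extra_cycles_pct(1115, 1010)  # 1.8 as measured on the two Sandy Bridge machines

print(f"Atom: {atom:.1f}%")           # prints "Atom: 44.4%"
print(f"Sandy Bridge: {sandy:.1f}%")  # prints "Sandy Bridge: 10.4%"
```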
Cheers.
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List