As for Sandy Bridge, I don't know... I could observe nominal variations of 2-3% on my machine, but nothing close to 10%, so this is odd... If you have the energy, test with a varying stack seed(*)...
It was my error: I measured it in a special application that doesn't know about OPENSSL_ia32cap_P and therefore takes the non-shrd path. The right numbers are 1005 for the small loop and 971 for the unrolled one. 971 is the best value I've ever seen! Great work!
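To illustrate the measurement pitfall described above, here is a minimal sketch in C. It is not the actual OpenSSL code: cap_word, CAP_PREFER_SHRD, cap_setup and select_path are hypothetical names, and the bit layout is an assumption. It only shows why a harness that never fills in the capability word ends up on the generic (non-shrd) path.

```c
#include <stdint.h>

/* Hypothetical stand-in for a capability word such as OPENSSL_ia32cap_P.
 * It stays zero until some setup routine fills it in. */
static uint32_t cap_word = 0;

/* Hypothetical bit meaning "shrd is the preferred rotate on this CPU". */
#define CAP_PREFER_SHRD (1u << 0)

enum rotate_path { PATH_GENERIC, PATH_SHRD };

/* Hypothetical setup step; a real library would query CPUID here
 * instead of taking the answer as an argument. */
static void cap_setup(int cpu_prefers_shrd)
{
    cap_word = cpu_prefers_shrd ? CAP_PREFER_SHRD : 0;
}

/* Pick the code path from the capability word alone. */
static enum rotate_path select_path(void)
{
    return (cap_word & CAP_PREFER_SHRD) ? PATH_SHRD : PATH_GENERIC;
}
```

A benchmark that calls select_path() without ever calling cap_setup() sees a zero capability word and measures the generic path, whatever the CPU actually is.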
Come on, apart from your Sandy Bridge result for 1.8, it's virtually equivalent. The nominal difference can be explained by environmental factors, and if not, it's a really low price to pay for a >40% improvement on Atom. Besides, it's actually the "slow" architectures that need optimization more :-)
Now I agree ;) The 1.8 version is "best-balanced" for all architectures.
-- 
SY / C4acT/\uBo Pavel Semjanov
http://www.semjanov.com
