>>>> I did observe more than 20% on Opteron, but on Core2/Sandy Bridge
>>>> I get only 13-11%...
>>>
>>> Well, I've got 984 / 1170 clocks on Core 2 (17%)
>>> and 1033 / 1250 on Core i5 (Westmere) (18%)
>>
>> Out of curiosity, how fast is updated code from CVS on Westmere?
>
> Sorry, too many codenames. It is Lynnfield.

Let's refer to "significant designs" instead. Among contemporary Intel
cores one can recognize Core 2, Nehalem, Sandy Bridge, [Atom] ...
Westmere, Lynnfield and Clarkdale all fall into the Nehalem category.

> And the result exactly for Lynnfield is unexpected,

Don't you feel sometimes that Intel mocks you? :-) :-) :-)

> see below:
> clocks for 1.5 / 1.6 / my version:
> Core2         1170 / 1131 / 984
> Core i5       1250 / 1430 (!) / 1033

Ouch! http://cvs.openssl.org/chngview?cn=22597.

> P4 Northwood  2108 / 2046 / 1957

This contradicts my tests. Specifically, I measured a slow-down for your
code on P4. Then again, my P4 is the first available model, while
Northwood is a later, improved core. Incidentally, version 1.7 runs even
faster on my P4: I measured a 31->29 cpb improvement. Could you retest
1.7 on your P4?

> AMD K10       1270 / 1200 / 1058

Just to clarify: the purpose of the exercise is not to dismiss the
submission, but to figure out the pros and cons on as many CPU
implementations as possible. Though I admit I am a bit reluctant to
accept a ~10x size blow-up, especially for small blocks...
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
