>>>> I did observe more than 20% on Opteron, but on Core2/Sandy Bridge
>>>> I get only 13-11%...
>>>
>>> Well, I've got 984 / 1170 clocks on Core 2 (17%)
>>> and 1033 / 1250 on Core i5 (Westmere) (18%)
>>
>> Out of curiosity, how fast is updated code from CVS on Westmere?
> 
> Sorry, too many codenames. It is Lynnfield.

Let's refer to "significant designs" instead. Among contemporary Intel
cores one can recognize Core 2, Nehalem, Sandy Bridge, [Atom] ...
Westmere, Lynnfield and Clarkdale all fall into the Nehalem category.

> And the result exactly for Lynnfield is unexpected,

Don't you feel sometimes that Intel mocks you? :-) :-) :-)

> see below:
> clocks for 1.5 / 1.6 / my version:
> Core2 1170 / 1131 / 984
> Core i5 1250 / 1430 (!) / 1033

Ouch! http://cvs.openssl.org/chngview?cn=22597.

> P4 Northwood 2108 / 2046 / 1957

This contradicts my tests. Specifically, I measured a slow-down for your
code on P4. Then again, my P4 is the first available model, while
Northwood is a later, improved core. Incidentally, version 1.7 runs even
faster on my P4: I measured an improvement from 31 to 29 cpb. Could you
retest 1.7 on your P4?

> AMD K10 1270 / 1200 / 1058

Just to clarify: the purpose of the exercise is not to dismiss the
submission, but to figure out the pros and cons on as many CPU
implementations as possible. Though I admit I am a bit reluctant to
accept the ~10x size blow-up, especially for small blocks...
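
To make the per-CPU comparison easier to eyeball, the clock counts quoted
above can be tabulated with a short script. This is only a minimal sketch
in Python: the names CLOCKS, reduction_pct and summarize are made up for
illustration, and the convention used (clocks saved relative to version
1.5) is an assumption. The thread does not say which convention the
percentages quoted earlier follow, so treat the output as indicative.

```python
# Clock counts quoted in this thread, per CPU:
# (version 1.5, version 1.6, submitted code).
CLOCKS = {
    "Core2": (1170, 1131, 984),
    "Core i5 (Lynnfield)": (1250, 1430, 1033),
    "P4 Northwood": (2108, 2046, 1957),
    "AMD K10": (1270, 1200, 1058),
}


def reduction_pct(baseline, candidate):
    """Percentage of clocks saved by candidate relative to baseline.

    Assumed convention: positive means faster than the baseline,
    negative means slower.
    """
    return 100.0 * (baseline - candidate) / baseline


def summarize(clocks=CLOCKS):
    """Map each CPU to (1.6 vs 1.5, submitted vs 1.5) reductions in percent."""
    rows = {}
    for cpu, (v15, v16, submitted) in clocks.items():
        rows[cpu] = (reduction_pct(v15, v16), reduction_pct(v15, submitted))
    return rows


if __name__ == "__main__":
    for cpu, (d16, dsub) in summarize().items():
        print(f"{cpu:22s} 1.6: {d16:+6.1f}%   submitted: {dsub:+6.1f}%")
```

A negative figure in the 1.6 column marks a regression against 1.5, which
is the case for the Core i5 numbers above.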

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
