From: Andy Polyakov <ap...@openssl.org>
Date: Sun, 23 Sep 2012 22:53:53 +0200

>> The techniques used in this plain v9 implementation are:
>>
>> 1) Use little-endian 32-bit loads when input data is aligned.
>> 2) Avoid having to accumulate into the context hash values every
>>    loop iteration.
>> 3) In the aligned case try to separate the loads from the first
>>    use by as many instructions as possible, without sacrificing
>>    the schedule too much.
>> 4) Attempt to dual-issue as much as possible on UltraSPARC-I/II/III/IV
>>    and SPARC-T4.
> 
> I had an old module lying around, dusted it off in
> http://cvs.openssl.org/chngview?cn=22842. It's 20% faster than your
> version on US pre-Tx. Improvement coefficient is likely to be even
> higher on T1, because it keeps everything in the register bank and there
> are no loads except for input. Not really relevant, but it's nominally
> faster even on T4.

Could you discuss something like this before checking in such
changes instead of just silently dismissing work I've posted?
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
