From: Andy Polyakov
Date: Fri, 28 Sep 2012 21:00:18 +0200
>> We don't have enough registers to unroll it by another factor,
>
> aes01 %key0,%reg0,%reg1,%reg2
> aes23 %key1,%reg0,%reg1,%reg1 <<< 1, not 3
> aes01 %key2,%reg2,%reg1,%reg0
> aes23 %key4,%reg2,%reg1,%reg1
>
> allows for 4x interleave up to 192-bit, right? 3*4+13*4=64? O
What is the rationale behind choosing an interleave factor of two for
parallelizable modes? Judging from the aes-128 cbc encrypt benchmarks, the
AES round instruction latency is 4. If the processor can pair together two
half-round instructions (I refer to the fact that it takes two instructions
to perform a single round),
From: Andy Polyakov
Date: Fri, 28 Sep 2012 17:15:34 +0200
> The preprocessor isn't mighty enough on Solaris and we have to come up
> with an alternative solution.
Are you really sure Solaris's CPP can't do proper pasting?
Perhaps there is a C99 mode or similar option that isn't being passed
in CFLAGS
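
For reference, "proper pasting" here presumably means ISO C token pasting with the ## operator. The sketch below shows what a conforming preprocessor is expected to do. All names in it (PASTE, XPASTE, ROUNDS, key_10, paste_demo) are hypothetical and chosen only for illustration; nothing here is taken from the patches, and the thread does not confirm that this is the exact construct that fails on Solaris. A traditional (pre-ANSI) cpp mode typically has no ## operator at all, which is one way such a failure could arise.

```c
/* Hypothetical names throughout; an illustration, not code from the patch. */

/* Direct paste: the arguments are NOT macro-expanded before joining. */
#define PASTE(a, b)  a##b

/* Two-level paste: the arguments are expanded first, then joined. */
#define XPASTE(a, b) PASTE(a, b)

#define ROUNDS 10

static int key_10 = 42;

int paste_demo(void)
{
    /* XPASTE(key_, ROUNDS) expands ROUNDS to 10, then pastes: key_10.
     * PASTE(key_, ROUNDS) would instead yield the identifier key_ROUNDS. */
    return XPASTE(key_, ROUNDS);
}
```

If a preprocessor handles this correctly, paste_demo() returns the value stored in key_10.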
From: Andy Polyakov
Date: Fri, 28 Sep 2012 17:15:34 +0200
> What is the rationale behind choosing an interleave factor of two for
> parallelizable modes? Judging from the aes-128 cbc encrypt benchmarks, the
> AES round instruction latency is 4. If the processor can pair together two
> half-round instructions (I refe
> This builds on top of the 7-patch series I sent the other day, which
> laid the foundation for SPARC crypto opcode support.
>
> The first patch plugs in optimized versions of key expansion and
> AES_{decrypt,encrypt}().
>
> The second patch is modelled on the AESNI support and explicitly
> optimi
This builds on top of the 7-patch series I sent the other day, which
laid the foundation for SPARC crypto opcode support.
The first patch plugs in optimized versions of key expansion and
AES_{decrypt,encrypt}().
The second patch is modelled on the AESNI support and explicitly
optimizes ECB, CBC, C