From: Andy Polyakov
Date: Sat, 22 Sep 2012 22:02:55 +0200
> Basically, in the context I'd prefer not to touch aes-sparcv9.pl and
> stick to "aesni" approach as the only one, i.e. keep T4 code as
> separate module referred from EVP. It allows to concentrate on
> things that matter, optimizing spec
I agree that it's not the most optimal, but at the same time no real
reason to feel bad about it.
But on the other hand I've done all the work to implement the macros
to do the PIC sequence properly. You really don't have to implement
anything.
BTW, two other points need restating:
1) My macr
From: David Miller
Date: Sat, 22 Sep 2012 15:14:23 -0400 (EDT)
> From: Andy Polyakov
> Date: Sat, 22 Sep 2012 21:04:47 +0200
>
>> I agree that it's not the most optimal, but at the same time no real
>> reason to feel bad about it.
>
> But on the other hand I've done all the work to implement t
From: Andy Polyakov
Date: Sat, 22 Sep 2012 21:04:47 +0200
> I agree that it's not the most optimal, but at the same time no real
> reason to feel bad about it.
But on the other hand I've done all the work to implement the macros
to do the PIC sequence properly. You really don't have to implemen
About the RAS stack missing cost, every Sun-produced UltraSPARC chip
pushes unconditionally onto the RAS and does not special-case the
"call .+8" pattern.
Thinking about this logically, a RAS miss can (at best) perform like a
full branch misprediction. Which on UltraSPARC results in a
From: Andy Polyakov
Date: Sat, 22 Sep 2012 20:11:11 +0200
> I wondered about the specific mechanism by which it's achieved (does
> montmul trigger a window trap),
Yes, this is exactly what the instruction does.
It issues fill traps until the CANRESTORE register is NWINDOWS-2.
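
To make the counting concrete, here is a toy C model of that fill loop.
It is only a sketch under stated assumptions: it relies on the SPARC V9
invariant CANSAVE + CANRESTORE + OTHERWIN = NWINDOWS - 2, assumes
NWINDOWS = 8 and OTHERWIN = 0, and reduces each fill trap to its net
effect on the counters. The names win_state and fill_all_windows are
made up for illustration and do not come from any real code.

```c
#include <assert.h>

enum { NWINDOWS = 8 };   /* assumed window count for this sketch */

/* Hypothetical container for the window-state registers. */
struct win_state {
    int cansave;
    int canrestore;
    int otherwin;
};

/* SPARC V9 invariant: CANSAVE + CANRESTORE + OTHERWIN == NWINDOWS - 2 */
static int invariant_holds(const struct win_state *w)
{
    return w->cansave + w->canrestore + w->otherwin == NWINDOWS - 2;
}

/*
 * Model the fill traps as a loop that runs until CANRESTORE reaches
 * NWINDOWS - 2. Each trap reloads one window from the stack; with
 * OTHERWIN == 0 the net effect is CANRESTORE++ and CANSAVE--.
 * Returns the number of traps taken.
 */
static int fill_all_windows(struct win_state *w)
{
    int traps = 0;

    assert(w->otherwin == 0);        /* simplifying assumption */
    assert(invariant_holds(w));

    while (w->canrestore < NWINDOWS - 2) {
        w->canrestore++;
        w->cansave--;
        assert(invariant_holds(w));
        traps++;
    }
    return traps;
}
```

Starting from CANSAVE = 6 and CANRESTORE = 0, the model takes six fill
traps and ends with CANRESTORE = 6 and CANSAVE = 0.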
From: Andy Polyakov
Date: Sat, 22 Sep 2012 20:01:03 +0200
> And it seems to be the case here. Because (var-.Lpic) doesn't seem
> to work with external variables on SPARC Solaris. Unfortunate...
The simplistic existing expressions also won't work for
des_enc.m4's local tables once the DES opcode
But the main question was about how context switch is handled between
save and say mulmont. I mean the part after "save-s ought to allocate
frames."
I'm confused.
The cpu has 8 register windows.
This means that we can save down 7 times and fill all of the
registers in each window with the valu
I'll handle this, but differently. Specifically I won't go through GOT,
but directly to variable, something like this:
I would like to politely request that you don't go down this road.
.Lretl:
retl
nop
...
sethi %hi(var-.Lpic),%reg
.Lpic: call .Lretl
add
From: Andy Polyakov
Date: Sat, 22 Sep 2012 19:09:27 +0200
>>> No, before thinking about 32-bit mode, I quickly ask what's with
>>> save-s without arguments?
>> Sorry, I just wrote that code as pseudo-code off the top of my
>> head without attending to all of the necessary details.
>> We would
No, before thinking about 32-bit mode, I quickly ask what's with save-s
without arguments?
Sorry, I just wrote that code as pseudo-code off the top of my
head without attending to all of the necessary details.
We would indeed need to allocate a minimal stack frame in each
save instruction.
It'
The biggest trick here is providing the mechanism necessary to expand
the key properly.
The DES opcodes expect the expanded key to be in a different format
than the generic openssl DES code does.
So we use some include and CPP define trickery so that we can override
the key expansion in the cases
This will be used when supporting the sparc DES opcodes
as they expect the key to be expanded differently.
Signed-off-by: David S. Miller
---
Configure | 10 +-
crypto/des/Makefile | 2 +-
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/Configure b/Configur
Also, add a missing include of opensslconf.h so that we properly
get the OPENSSL_SYSNAME_ULTRASPARC define even in the 32-bit case.
These changes give a pretty reasonable speed boost.
On a SPARC T4-2, without these changes:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192
DES took a little bit more work.
It stems from a common issue in that the DES opcodes expect the
expanded key to be in a different format from the one the generic DES
code puts it in.
Complicating things further, the fcrypt code cannot use the DES
opcodes because it wants the rounds computed in