I agree that it's not optimal, but at the same time there's no real
reason to feel bad about it.
But on the other hand I've done all the work to implement the macros
to do the PIC sequence properly. You really don't have to implement
anything.
BTW, two other points need restating:
1) My macros handle the non-PIC case optimally.
2) Your RAS corruption cost estimate considers only the most
immediate effect, on the return from the assembler routine in
question.
The true RAS miss cost must be multiplied across the next N
functions up the call chain, where N is the size of the RAS,
since all of those will miss as well.
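
To make the multiplication argument concrete, here is a toy model of the RAS as a small circular stack. It is only a sketch under stated assumptions: the corruption is assumed to come from a call issued just to read the PC (so it pushes an entry that no return ever pops), the stack is assumed to wrap with no empty/full tracking, and the names count_return_misses and RAS_MAX are made up for illustration. Real hardware may well behave differently.

```c
#include <assert.h>
#include <stddef.h>

#define RAS_MAX 64  /* hypothetical upper bound for this toy model */

/* Toy model only: counts mispredicted returns while unwinding a call
 * chain of the given depth through a circular RAS of ras_size entries.
 * If corrupt is nonzero, one extra entry is pushed in the leaf routine
 * to mimic a call that is never matched by a return. */
size_t count_return_misses(size_t depth, size_t ras_size, int corrupt)
{
    unsigned long ras[RAS_MAX] = {0};
    size_t top = 0, misses = 0;

    assert(ras_size > 0 && ras_size <= RAS_MAX);

    /* Each nested call pushes its return address (modelled as 1..depth). */
    for (size_t i = 1; i <= depth; i++) {
        top = (top + 1) % ras_size;
        ras[top] = i;
    }

    /* The unmatched call pushes a bogus entry on top of the real ones. */
    if (corrupt) {
        top = (top + 1) % ras_size;
        ras[top] = ~0UL;
    }

    /* Unwind: every return pops a prediction and compares it with the
     * real return address of that frame. */
    for (size_t i = depth; i >= 1; i--) {
        if (ras[top] != i)
            misses++;
        top = (top + ras_size - 1) % ras_size;
    }
    return misses;
}
```

In this model, with ras_size 4 and a call chain of depth 4, a clean unwind gives 0 misses while the corrupted one gives 4: the one bogus entry shifts every remaining prediction by one slot, so each of the N returns misses, not just the first.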
N is 4 on UltraSPARC. For comparison, in the AES case the depth from
EVP_encrypt to the assembly code is 4, so the penalties don't spill onto
the caller. [Apparently we are talking about an obsolete platform, as I
measure no performance difference on T4 between the sequences depicted
in the last message.] All I'm saying is that it doesn't have to be
classified as "absolutely critical to fix." Basically, in this context
I'd prefer not to touch aes-sparcv9.pl and to stick to the "aesni"
approach as the only one, i.e. keep the T4 code as a separate module
referenced from EVP. That allows us to concentrate on things that
matter: optimizing the performance of specific modes. By extension, it's
the preferred approach for other ciphers as well.