Hi,

BTW, have you considered a synergetic implementation, which would work
as follows. Arrange an intermediate buffer followed by a non-accessible
page [commonly done with an anonymous mmap of two pages followed by
mprotect(PROT_NONE) on the second page]. Upon *_init we call the
software SHA*_Init. All short inputs then go directly through the
software SHA*_Update, while everything larger than a certain value, say
256 bytes, is treated as follows. The input stream is first
"purged/aligned" by running a single pass of SHA*_Update until
SHA*_CTX->data is full. Then the available 64-byte chunks are copied to
the *bottom* of the first page mentioned above. Then we set up a SEGV
signal handler, let the hardware suffer the page fault and collect the
intermediate hash values. The procedure is repeated if more than a
page's worth was available at a time. SHA*_CTX->Nl,Nh are adjusted
accordingly and the remaining bytes [if any] are fed again to the
software SHA*_Update. Upon *_final we just call the *software*
SHA*_Final.

Man that's a wicked idea ;-) Though I'm not sure how xsha would survive
restarting after its segfault.

Well, the idea is rather to *not* restart it, but to collect the
intermediate results and terminate it. These results are then fed either
to software or back to hardware as if it were a whole lot of new data,
but with the init values from the previous step. The key is also to
*never* let the hardware do the final padding and final block
calculation [which is why it always looks like a whole lot of data to
the hardware]. That's because the hardware never knows the correct Nl,Nh
values used for final padding; only the software does.

Are you sure it flushes the intermediate
results on exception? Well we can try ;-)

Manual says it does. Well, it doesn't say it flushes on SEGV in particular, but at the low level processors don't normally distinguish SEGV, page fault or other exceptions. They just go like "oh! it's *an* exception, I flush, go kernel, call handler." Manual essentially says "I flush upon *an* exception."

Would such an approach work on all architectures (anonymous and
protected pages, sighandlers, ...)?

I don't know, but we can always make it conditionally available on explicitly tested architectures:-) You also have to realize that it takes extra effort to make such an implementation thread-safe. There are basically two options. 1. Allocate the pages on a per-thread basis [which would require a unified API for per-thread storage, something we don't have]. 2. Serialize access to the hardware [which we do have a unified API for]. As the hardware is faster than the network, the second is a perfectly viable option.

In the meantime could we go with the old fashioned patches that I sent
some time ago? I'll realign them with current CVS head (or 0.9.8 branch).

There were unanswered questions like support for SHA-224, a test suite with a public record that it passes, EVP_MD_FLAG_ONESHOT... But I don't have time to look into it right now; we'll have to do it in May or something... A.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           [EMAIL PROTECTED]
