Ok. How about now?

Subject to SIGBUS on most platforms. It's easy to get carried away scoring points on x86 and, in the process, render support for other platforms void, isn't it? I mean, do mind unaligned access!

I'm curious if there's a significant performance difference between using u32 and u64; the former should be portable to all supported platforms, and may make the latter unnecessary.

I'd recommend [or even insist on] for (i=0;i<16/sizeof(long);i++) loops and letting the compiler unroll them. That gives 4x4-byte chunks on 32-bit platforms and 2x8-byte chunks on 64-bit ones, without a single shred of "#if that-or-that" spaghetti and with no unnecessary dependency on the totally unrelated bn.h. And once again, unaligned input/output is to be treated byte by byte.
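
To make the suggestion concrete, here is a minimal sketch of such a loop for one 16-byte block. The name xor_block16 and the BLOCK_SIZE macro are hypothetical, chosen only for illustration; they are not existing OpenSSL identifiers. The sketch assumes sizeof(long) divides 16 and uses C99 uintptr_t for the alignment test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16  /* hypothetical name: one 128-bit cipher block */

/*
 * Hypothetical helper (not an OpenSSL function): out = a XOR b over one
 * 16-byte block. out may be the same buffer as a or b.
 */
static void xor_block16(unsigned char *out, const unsigned char *a,
                        const unsigned char *b)
{
    size_t i;

    /* Assumption: long is 4 or 8 bytes, so it divides the block evenly. */
    assert(BLOCK_SIZE % sizeof(long) == 0);

    /*
     * OR the three addresses together: a non-zero remainder means at
     * least one pointer is not long-aligned, so go byte by byte.
     */
    if ((((uintptr_t)out | (uintptr_t)a | (uintptr_t)b) % sizeof(long)) != 0) {
        for (i = 0; i < BLOCK_SIZE; i++)
            out[i] = a[i] ^ b[i];
        return;
    }

    /*
     * Aligned case: 4 iterations on 32-bit, 2 on 64-bit platforms.
     * The trip count is a compile-time constant, so the compiler is
     * free to unroll the loop. Strict-aliasing purists may prefer
     * memcpy() into local longs over these casts.
     */
    for (i = 0; i < BLOCK_SIZE / sizeof(long); i++)
        ((unsigned long *)out)[i] =
            ((const unsigned long *)a)[i] ^ ((const unsigned long *)b)[i];
}
```

The alignment test costs one OR, one mask and one branch per block, which is likely small next to the block cipher call itself, and there is no #if on word size anywhere.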

Plus, if we're going to go that route, we should consider that some platforms have 128-bit XOR support in hardware; is it worth implementing that too?

Is it really that widely used or important a mode? Enough to justify that much extra complexity for little gain?

How much of this should be extended to other ciphers? Should xorN() and moveN() be part of the bignum code for reuse in other modules?

I'd be opposed to this. If performance gets that important, a function call will hardly beat inline code anyway, even if the function is, say, 128-bit SSE2 and the inline code is just 4x32-bit. A.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           [EMAIL PROTECTED]