I've been thinking of playing with improving the speed of OpenBSD's cryptography primitives. My tentative plans:

- benchmark aes-ctr performance with current code vs. optimized assembly code (e.g., just hacking sys/crypto/rijndael.c to use optimized code); if no significant improvement, abort
- add new drivers that attach on specific CPUs and hook into the crypto framework to provide optimized implementations
- add support for i386/amd64 to allow limited FPU/MMX/SSE use in the kernel
- repeat above, including implementations that take advantage of FPU/MMX/SSE instructions
- experiment with adding new stream ciphers (e.g., Salsa20)
- experiment with making swcr_encdec better aware of stream ciphers to avoid unnecessary copying to fit block-size or padding
- experiment with adding new MACs (e.g., Poly1305)

My long term goal/hope is to speed up IPsec, but in the interim I only have one machine to work with, so for now I'll probably just measure the time it takes to handle requests from user-space.

If anyone has feedback/suggestions on the above plans, I'm happy to hear them.
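
To make the first step (compare current vs. optimized code, abort if there is no significant gain) a bit more concrete, here is a rough user-space sketch of a timing harness. Everything in it is an assumption for illustration: the names cipher_fn, xor_bytewise, xor_wordwise, bench_seconds and speedup are made up, and the two XOR routines are only stand-ins for a "baseline" and an "optimized" implementation. They are not AES-CTR and do not touch the kernel crypto framework.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() under strict -std= modes */

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical interface: transform len bytes of buf in place. */
typedef void (*cipher_fn)(uint8_t *buf, size_t len);

/* Stand-in for the "current" code: XOR one byte at a time (NOT AES-CTR). */
static void
xor_bytewise(uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		buf[i] ^= 0x5a;
}

/* Stand-in for the "optimized" code: same transform, 8 bytes at a time. */
static void
xor_wordwise(uint8_t *buf, size_t len)
{
	const uint64_t ks = 0x5a5a5a5a5a5a5a5aULL;
	uint64_t w;
	size_t i;

	for (i = 0; i + sizeof(w) <= len; i += sizeof(w)) {
		memcpy(&w, buf + i, sizeof(w));	/* avoids unaligned access */
		w ^= ks;
		memcpy(buf + i, &w, sizeof(w));
	}
	for (; i < len; i++)			/* leftover tail bytes */
		buf[i] ^= 0x5a;
}

/* Wall-clock seconds taken by iters calls of fn over the same buffer. */
static double
bench_seconds(cipher_fn fn, uint8_t *buf, size_t len, int iters)
{
	struct timespec t0, t1;
	int i;

	assert(iters > 0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		fn(buf, len);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (double)(t1.tv_sec - t0.tv_sec) +
	    (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Ratio of baseline time to optimized time; 0 if the latter is unmeasurable. */
static double
speedup(double base_sec, double opt_sec)
{
	return opt_sec > 0.0 ? base_sec / opt_sec : 0.0;
}
```

One would call bench_seconds() once per implementation on the same buffer size and iteration count, then feed both results to speedup() and decide whether the gain is worth pursuing. Real measurements would need large iteration counts and several buffer sizes (e.g., typical packet sizes) to be meaningful.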