Hi Phil,

I did not venture too far into new constructs, for the compatibility reasons you mentioned. A user will weigh how much they gain by going with an incompatible extension. But I think this is the only way forward, and I'm trying several alternatives.
My initial results for CIOCHASH show some improvements over the standard API. I was surprised to find that a smaller number of ioctls did contribute to the performance, but not by much. Not calling the RNG in CIOCGSESSION and not having to look up the session in subsequent ioctls were apparently just as significant (if not more so).

Cristian S.

Here are some results I got on a P5040RDB with the caam driver (slightly modified tests/sha_speed.c):

# sha_speed
Testing SHA1 Hash:
using algorithm sha1 with driver sha1-caam
Encrypting in chunks of 256 bytes: done. 137.48 MB in 10.00 secs: 13.75 MB/sec
Encrypting in chunks of 512 bytes: done. 274.74 MB in 10.00 secs: 27.47 MB/sec
Encrypting in chunks of 1024 bytes: done. 521.38 MB in 10.00 secs: 52.14 MB/sec
Encrypting in chunks of 2048 bytes: done. 967.60 MB in 10.00 secs: 96.76 MB/sec
Encrypting in chunks of 4096 bytes: done. 1.63 GB in 10.00 secs: 0.16 GB/sec
Encrypting in chunks of 8192 bytes: done. 2.55 GB in 10.00 secs: 0.26 GB/sec
Encrypting in chunks of 65536 bytes: done. 5.09 GB in 10.00 secs: 0.51 GB/sec

Testing SHA1 CIOCHASH:
Encrypting in chunks of 256 bytes: done. 169.87 MB in 10.00 secs: 16.99 MB/sec
Encrypting in chunks of 512 bytes: done. 339.46 MB in 10.00 secs: 33.95 MB/sec
Encrypting in chunks of 1024 bytes: done. 642.85 MB in 10.00 secs: 64.29 MB/sec
Encrypting in chunks of 2048 bytes: done. 1.18 GB in 10.00 secs: 0.12 GB/sec
Encrypting in chunks of 4096 bytes: done. 1.91 GB in 10.00 secs: 0.19 GB/sec
Encrypting in chunks of 8192 bytes: done. 2.89 GB in 10.00 secs: 0.29 GB/sec
Encrypting in chunks of 65536 bytes: done. 5.24 GB in 10.00 secs: 0.52 GB/sec

Testing SHA256 Hash:
using algorithm sha256 with driver sha256-caam
Encrypting in chunks of 256 bytes: done. 139.35 MB in 10.00 secs: 13.94 MB/sec
Encrypting in chunks of 512 bytes: done. 278.88 MB in 10.00 secs: 27.89 MB/sec
Encrypting in chunks of 1024 bytes: done. 552.42 MB in 10.00 secs: 55.24 MB/sec
Encrypting in chunks of 2048 bytes: done. 1.00 GB in 10.00 secs: 0.10 GB/sec
Encrypting in chunks of 4096 bytes: done. 1.73 GB in 10.00 secs: 0.17 GB/sec
Encrypting in chunks of 8192 bytes: done. 2.79 GB in 10.00 secs: 0.28 GB/sec
Encrypting in chunks of 65536 bytes: done. 6.03 GB in 10.00 secs: 0.60 GB/sec

Testing SHA256 CIOCHASH:
Encrypting in chunks of 256 bytes: done. 178.56 MB in 10.00 secs: 17.86 MB/sec
Encrypting in chunks of 512 bytes: done. 337.22 MB in 10.00 secs: 33.72 MB/sec
Encrypting in chunks of 1024 bytes: done. 673.49 MB in 10.00 secs: 67.35 MB/sec
Encrypting in chunks of 2048 bytes: done. 1.22 GB in 10.00 secs: 0.12 GB/sec
Encrypting in chunks of 4096 bytes: done. 2.02 GB in 10.00 secs: 0.20 GB/sec
Encrypting in chunks of 8192 bytes: done. 3.17 GB in 10.00 secs: 0.32 GB/sec
Encrypting in chunks of 65536 bytes: done. 6.22 GB in 10.00 secs: 0.62 GB/sec

________________________________________
From: Phil Sutter <p...@nwl.cc>
Sent: Sunday, February 14, 2016 12:30 PM
To: Cristian Stoica
Cc: cryptodev-linux-devel@gna.org
Subject: Re: [PATCH 0/2] RFC for CIOCHASH digest ioctl

Hi Cristian,

On Thu, Jan 21, 2016 at 12:58:55PM +0200, Cristian Stoica wrote:
> Hi Phil,
>
> These two patches add CIOCHASH ioctl that targets only hash operations. It
> is a proof of concept with the goal of improving digest performance and make
> better use of Linux crypto-API. Please ignore code duplication.
>
> My proposal aims to complete the design of the current cryptodev API so that
> both init-update-finalize and digest operations are available.
>
> A smart user can calibrate how much data to buffer before calling update (to
> offset the ioctl costs) or can decide upfront for one method or the other.
> (Openssl for example buffers the whole data and then calls cryptodev-update
> through CIOCCRYPT).
>
> In a particular machine, cryptodev/tests/speed.c reports for SHA1 0.25GB/s
> with CIOCCRYPT and 0.29GB/s with CIOCHASH.
>
> Please share your feedback for improving this prototype.

Thanks for providing this code, things are a bit more clear now. So you want to eliminate the overhead of CIOCGSESSION/CIOCFSESSION for single CIOCCRYPT operations, right?

Have you considered creating a combined struct which contains both structs session_op and crypt_op instead of the new struct hash_op_data? Without a closer look, something like:

| struct sscrypt_op {
|         struct session_op sess;
|         struct crypt_op cop;
| };

might work in combination with a new ioctl CIOCSSCRYPT ("single-shot crypt"). This could then be used for crypto operations, as well.

Do you have benchmark data for your RFC patch 2/2? The amount of code-duplication seems quite fair, and I doubt it's worth the effort since it does not eliminate any context switching (or does it?).

Another concern might be low acceptance from cryptodev users. Openssl et al. use the generic cryptodev interface as it is provided by e.g. OCF, I'm not sure they want to implement support for a Linux-specific version. IMO this reduces the whole concept of compatibility to the generic cryptodev interface ad absurdum. That being said, the same applies to CIOCGSESSINFO as well, of course.

Cheers, Phil

_______________________________________________
Cryptodev-linux-devel mailing list
Cryptodev-linux-devel@gna.org
https://mail.gna.org/listinfo/cryptodev-linux-devel
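
As a rough illustration of the combined struct Phil suggests in his mail, here is how a caller might fill one request for a single-shot hash. This is only a sketch under stated assumptions: the two inner structs are cut-down stand-ins for the real definitions in crypto/cryptodev.h (which carry more fields), EXAMPLE_ALG_SHA1 and sscrypt_prepare_hash() are hypothetical names invented for the example, and CIOCSSCRYPT exists only as a proposal in this thread, so no ioctl is issued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the cryptodev structs. The real definitions
 * live in crypto/cryptodev.h and have more fields (keys, IV, flags). */
struct session_op {
    uint32_t cipher;   /* cipher algorithm, 0 when only hashing */
    uint32_t mac;      /* hash/MAC algorithm identifier */
    uint32_t ses;      /* session id, normally returned by the kernel */
};

struct crypt_op {
    uint32_t ses;      /* session id to operate on */
    uint32_t len;      /* length of src in bytes */
    uint8_t *src;      /* input data */
    uint8_t *mac;      /* output buffer for the digest */
};

/* Combined request as proposed in the mail: session parameters plus
 * one operation, so a single ioctl could carry both. */
struct sscrypt_op {
    struct session_op sess;
    struct crypt_op cop;
};

/* Hypothetical algorithm id used only by this sketch. */
enum { EXAMPLE_ALG_SHA1 = 1 };

/* Hypothetical helper: zero the request and fill in the fields a
 * hash-only, single-shot call would need. */
static void sscrypt_prepare_hash(struct sscrypt_op *req, uint32_t alg,
                                 uint8_t *data, uint32_t len,
                                 uint8_t *digest)
{
    assert(req != NULL && data != NULL && digest != NULL);
    memset(req, 0, sizeof(*req));
    req->sess.mac = alg;     /* no cipher, hash only */
    req->cop.len = len;
    req->cop.src = data;
    req->cop.mac = digest;
}
```

With such a layout the session id fields would stay unused for a single-shot call, which is presumably where the saved session lookup would come from; whether that matches the gains measured for CIOCHASH would need its own benchmark.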