Hi Phil,

I did not venture far into new constructs, for the compatibility reasons you
mentioned. Users will weigh how much they gain from adopting an incompatible
extension. Still, I think this is the only way forward, and I'm trying
several alternatives.

My initial results for CIOCHASH show some improvement over the standard API. I
was surprised to find that reducing the number of ioctls did contribute to the
performance, but not that much. Not calling the RNG in CIOCGSESSION and not
having to look up the session in subsequent ioctls were apparently just as
significant (if not more).

Cristian S.


Here are some results I got on a P5040RDB with the caam driver (using a
slightly modified tests/sha_speed.c):

# sha_speed
Testing SHA1 Hash:
using algorithm sha1 with driver sha1-caam
        Encrypting in chunks of 256 bytes: done. 137.48 MB in 10.00 secs: 13.75 MB/sec
        Encrypting in chunks of 512 bytes: done. 274.74 MB in 10.00 secs: 27.47 MB/sec
        Encrypting in chunks of 1024 bytes: done. 521.38 MB in 10.00 secs: 52.14 MB/sec
        Encrypting in chunks of 2048 bytes: done. 967.60 MB in 10.00 secs: 96.76 MB/sec
        Encrypting in chunks of 4096 bytes: done. 1.63 GB in 10.00 secs: 0.16 GB/sec
        Encrypting in chunks of 8192 bytes: done. 2.55 GB in 10.00 secs: 0.26 GB/sec
        Encrypting in chunks of 65536 bytes: done. 5.09 GB in 10.00 secs: 0.51 GB/sec
Testing SHA1 CIOCHASH:
        Encrypting in chunks of 256 bytes: done. 169.87 MB in 10.00 secs: 16.99 MB/sec
        Encrypting in chunks of 512 bytes: done. 339.46 MB in 10.00 secs: 33.95 MB/sec
        Encrypting in chunks of 1024 bytes: done. 642.85 MB in 10.00 secs: 64.29 MB/sec
        Encrypting in chunks of 2048 bytes: done. 1.18 GB in 10.00 secs: 0.12 GB/sec
        Encrypting in chunks of 4096 bytes: done. 1.91 GB in 10.00 secs: 0.19 GB/sec
        Encrypting in chunks of 8192 bytes: done. 2.89 GB in 10.00 secs: 0.29 GB/sec
        Encrypting in chunks of 65536 bytes: done. 5.24 GB in 10.00 secs: 0.52 GB/sec
Testing SHA256 Hash:
using algorithm sha256 with driver sha256-caam
        Encrypting in chunks of 256 bytes: done. 139.35 MB in 10.00 secs: 13.94 MB/sec
        Encrypting in chunks of 512 bytes: done. 278.88 MB in 10.00 secs: 27.89 MB/sec
        Encrypting in chunks of 1024 bytes: done. 552.42 MB in 10.00 secs: 55.24 MB/sec
        Encrypting in chunks of 2048 bytes: done. 1.00 GB in 10.00 secs: 0.10 GB/sec
        Encrypting in chunks of 4096 bytes: done. 1.73 GB in 10.00 secs: 0.17 GB/sec
        Encrypting in chunks of 8192 bytes: done. 2.79 GB in 10.00 secs: 0.28 GB/sec
        Encrypting in chunks of 65536 bytes: done. 6.03 GB in 10.00 secs: 0.60 GB/sec
Testing SHA256 CIOCHASH:
        Encrypting in chunks of 256 bytes: done. 178.56 MB in 10.00 secs: 17.86 MB/sec
        Encrypting in chunks of 512 bytes: done. 337.22 MB in 10.00 secs: 33.72 MB/sec
        Encrypting in chunks of 1024 bytes: done. 673.49 MB in 10.00 secs: 67.35 MB/sec
        Encrypting in chunks of 2048 bytes: done. 1.22 GB in 10.00 secs: 0.12 GB/sec
        Encrypting in chunks of 4096 bytes: done. 2.02 GB in 10.00 secs: 0.20 GB/sec
        Encrypting in chunks of 8192 bytes: done. 3.17 GB in 10.00 secs: 0.32 GB/sec
        Encrypting in chunks of 65536 bytes: done. 6.22 GB in 10.00 secs: 0.62 GB/sec
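
To put the figures above in relative terms, here is a minimal C sketch that
computes the percentage gain of CIOCHASH over the standard API for a few of the
SHA1 rows. The struct and function names are made up for illustration and are
not part of tests/sha_speed.c; the numbers are copied from the output above.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the output above: throughput for the standard API and for
 * CIOCHASH at a given chunk size. Both values in a row share the same unit,
 * so their ratio does not depend on the unit. */
struct speed_sample {
    unsigned chunk_bytes;
    double standard;   /* "Testing SHA1 Hash" row */
    double ciochash;   /* "Testing SHA1 CIOCHASH" row */
};

/* Illustrative subset of the SHA1 rows reported above. */
const struct speed_sample sha1_samples[] = {
    {   256, 13.75, 16.99 },  /* MB/sec */
    {   512, 27.47, 33.95 },  /* MB/sec */
    {  1024, 52.14, 64.29 },  /* MB/sec */
    { 65536,  0.51,  0.52 },  /* GB/sec */
};

const size_t sha1_sample_count =
    sizeof(sha1_samples) / sizeof(sha1_samples[0]);

/* Relative gain of `candidate` over `baseline`, in percent. */
double gain_percent(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return (candidate - baseline) / baseline * 100.0;
}

/* Gain of CIOCHASH over the standard API for one row. */
double sample_gain(const struct speed_sample *s)
{
    return gain_percent(s->standard, s->ciochash);
}
```

For these rows the gain works out to roughly 23% for chunks up to 1024 bytes
and about 2% for 65536-byte chunks, which is consistent with CIOCHASH mostly
saving a fixed per-call overhead.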



________________________________________
From: Phil Sutter <p...@nwl.cc>
Sent: Sunday, February 14, 2016 12:30 PM
To: Cristian Stoica
Cc: cryptodev-linux-devel@gna.org
Subject: Re: [PATCH 0/2] RFC for CIOCHASH digest ioctl

Hi Cristian,

On Thu, Jan 21, 2016 at 12:58:55PM +0200, Cristian Stoica wrote:
> Hi Phil,
>
> These two patches add CIOCHASH ioctl that targets only hash operations. It
> is a proof of concept with the goal of improving digest performance and make
> better use of Linux crypto-API. Please ignore code duplication.
>
> My proposal aims to complete the design of the current cryptodev API so that
> both init-update-finalize and digest operations are available.
>
> A smart user can calibrate how much data to buffer before calling update (to
> offset the ioctl costs) or can decide upfront for one method or the other.
> (Openssl for example buffers the whole data and then calls cryptodev-update
> through CIOCCRYPT).
>
> In a particular machine, cryptodev/tests/speed.c reports for SHA1 0.25GB/s
> with CIOCCRYPT and 0.29GB/s with CIOCHASH.
>
> Please share your feedback for improving this prototype.

Thanks for providing this code, things are a bit more clear now.

So you want to eliminate the overhead of CIOCGSESSION/CIOCFSESSION for
single CIOCCRYPT operations, right? Have you considered creating a
combined struct which contains both structs session_op and crypt_op
instead of the new struct hash_op_data? Without a closer look, something
like:

| struct sscrypt_op {
|       struct session_op sess;
|       struct crypt_op cop;
| };

might work in combination with a new ioctl CIOCSSCRYPT ("single-shot
crypt"). This could then be used for crypto operations, as well.
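
As a rough illustration of the idea, the sketch below shows how a caller might
fill such a combined request for a plain digest. Everything here is an
assumption: the struct definitions are simplified stand-ins for the ones in
cryptodev.h, sscrypt_prepare_hash() is a made-up helper, and CIOCSSCRYPT does
not exist, so no ioctl is actually issued.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the cryptodev.h structures (illustration only). */
struct session_op {
    uint32_t cipher;
    uint32_t mac;
    uint32_t keylen;
    uint8_t *key;
    uint32_t mackeylen;
    uint8_t *mackey;
    uint32_t ses;
};

struct crypt_op {
    uint32_t ses;
    uint16_t op;
    uint16_t flags;
    uint32_t len;
    uint8_t *src;
    uint8_t *dst;
    uint8_t *mac;
    uint8_t *iv;
};

/* The combined struct suggested above. */
struct sscrypt_op {
    struct session_op sess;
    struct crypt_op cop;
};

/* Hypothetical helper: prepare a single-shot, unkeyed digest request.
 * `mac_alg` is whatever algorithm identifier the caller would normally put
 * in session_op.mac; `digest` must be large enough for that algorithm. */
void sscrypt_prepare_hash(struct sscrypt_op *req, uint32_t mac_alg,
                          uint8_t *data, uint32_t len, uint8_t *digest)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));   /* no cipher, no keys, no session id */
    req->sess.mac = mac_alg;        /* session half: which digest to use */
    req->cop.len = len;             /* operation half: input and output */
    req->cop.src = data;
    req->cop.mac = digest;
    /* A real implementation would now pass `req` to the hypothetical
     * CIOCSSCRYPT ioctl in place of the usual
     * CIOCGSESSION / CIOCCRYPT / CIOCFSESSION sequence. */
}
```

The session id fields stay zero because, in this sketch, the kernel side would
never hand a session back to user space.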

Do you have benchmark data for your RFC patch 2/2? The amount of
code-duplication seems quite fair, and I doubt it's worth the effort
since it does not eliminate any context switching (or does it?).

Another concern might be low acceptance from cryptodev users. Openssl
et al. use the generic cryptodev interface as it is provided by e.g.
OCF, I'm not sure they want to implement support for a Linux-specific
version. IMO this reduces the whole concept of compatibility to the
generic cryptodev interface ad absurdum. That being said, the same
applies to CIOCGSESSINFO as well, of course.

Cheers, Phil
