On Sun, Dec 06, 2020 at 11:45:23PM +0100, Ard Biesheuvel wrote:
> Follow the same approach as the arm64 driver for implementing a version
> of AES-NI in CBC mode that supports ciphertext stealing. Compared to the
> generic CTS template wrapped around the existing cbc-aes-aesni skcipher,
> this results in a ~2x speed increase for relatively short inputs (less
> than 256 bytes), which is relevant given that AES-CBC with ciphertext
> stealing is used for filename encryption in the fscrypt layer. For larger
> inputs, the speedup is still significant (~25% on decryption, ~6% on
> encryption).
> 
> Signed-off-by: Ard Biesheuvel <a...@kernel.org>
> ---
> Full tcrypt benchmark results for cts(cbc-aes-aesni) vs cts-cbc-aes-aesni
> after the diff (Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz)
> 
>  arch/x86/crypto/aesni-intel_asm.S  |  87 +++++++++++++
>  arch/x86/crypto/aesni-intel_glue.c | 133 ++++++++++++++++++++
>  2 files changed, 220 insertions(+)

This is passing the self-tests (including the extra tests), and it's definitely
faster, and would be useful for fscrypt.  I did my own benchmarks and got

Encryption:

        Message size  Before (MB/s)  After (MB/s)
        ------------  -------------  ------------
        32            136.83         273.04
        64            230.03         262.04
        128           372.92         487.71
        256           541.41         652.95

Decryption:

        Message size  Before (MB/s)  After (MB/s)
        ------------  -------------  ------------
        32            121.95         280.04
        64            208.72         279.72
        128           397.98         635.79
        256           723.09         1105.05

(This was with "Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz")

So feel free to add:

Tested-by: Eric Biggers <ebigg...@google.com>

I might not have time to fully review this, but one comment below:

> +static int cts_cbc_encrypt(struct skcipher_request *req)
> +{
[...]
> +static int cts_cbc_decrypt(struct skcipher_request *req)
> +{
[...]
>  #ifdef CONFIG_X86_64
> +     }, {
> +             .base = {
> +                     .cra_name               = "__cts(cbc(aes))",
> +                     .cra_driver_name        = "__cts-cbc-aes-aesni",
> +                     .cra_priority           = 400,
> +                     .cra_flags              = CRYPTO_ALG_INTERNAL,
> +                     .cra_blocksize          = AES_BLOCK_SIZE,
> +                     .cra_ctxsize            = CRYPTO_AES_CTX_SIZE,
> +                     .cra_module             = THIS_MODULE,
> +             },
> +             .min_keysize    = AES_MIN_KEY_SIZE,
> +             .max_keysize    = AES_MAX_KEY_SIZE,
> +             .ivsize         = AES_BLOCK_SIZE,
> +             .walksize       = 2 * AES_BLOCK_SIZE,
> +             .setkey         = aesni_skcipher_setkey,
> +             .encrypt        = cts_cbc_encrypt,
> +             .decrypt        = cts_cbc_decrypt,

The algorithm is conditional on CONFIG_X86_64, but the function definitions
aren't.

It needs to be one way or the other, otherwise there will be a compiler warning
on 32-bit builds.

- Eric

Reply via email to