Re: [PATCH v3 4/4] crypto: add CRYPTO_TFM_REQ_IV_SERIALIZE flag
On 10-02-2018 03:34, Stephan Müller wrote: > Crypto drivers may implement a streamlined serialization support for AIO > requests that is reported by the CRYPTO_ALG_SERIALIZES_IV_ACCESS flag to > the crypto user. When the user decides that he wants to send multiple > AIO requests concurrently and wants the crypto driver to handle the > serialization, the caller has to set CRYPTO_TFM_REQ_IV_SERIALIZE to notify > the crypto driver. Will the crypto_alloc_* API take care of this flag? For a kernel crypto user, the IV synchronization logic depends on whether the allocated tfm supports IV serialization or not. > > Only when this flag is enabled, the crypto driver shall apply its > serialization logic for handling IV updates between requests. If this > flag is not provided, the serialization logic shall not be applied by > the driver as the caller decides that it does not need it (because no > parallel AIO requests are sent) or that it performs its own > serialization. > > Signed-off-by: Stephan Mueller > --- > crypto/algif_aead.c | 15 --- > crypto/algif_skcipher.c | 15 --- > include/linux/crypto.h | 1 + > 3 files changed, 25 insertions(+), 6 deletions(-) > > diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c > index 619147792cc9..5ec4dec6e6a1 100644 > --- a/crypto/algif_aead.c > +++ b/crypto/algif_aead.c > @@ -66,13 +66,22 @@ static int aead_sendmsg(struct socket *sock, struct > msghdr *msg, size_t size) > { > struct sock *sk = sock->sk; > struct alg_sock *ask = alg_sk(sk); > + struct af_alg_ctx *ctx = ask->private; > struct sock *psk = ask->parent; > struct alg_sock *pask = alg_sk(psk); > struct aead_tfm *aeadc = pask->private; > - struct crypto_aead *tfm = aeadc->aead; > - unsigned int ivsize = crypto_aead_ivsize(tfm); > + struct crypto_aead *aead = aeadc->aead; > + struct crypto_tfm *tfm = crypto_aead_tfm(aead); > + unsigned int ivsize = crypto_aead_ivsize(aead); > + int ret = af_alg_sendmsg(sock, msg, size, ivsize); > + > + if (ret < 0) > + return ret; > > - return 
af_alg_sendmsg(sock, msg, size, ivsize); > + if (ctx->iiv == ALG_IV_SERIAL_PROCESSING) > + tfm->crt_flags |= CRYPTO_TFM_REQ_IV_SERIALIZE; > + > + return ret; > } > > static int crypto_aead_copy_sgl(struct crypto_skcipher *null_tfm, > diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c > index cf27dda6a181..fd2a0ba32feb 100644 > --- a/crypto/algif_skcipher.c > +++ b/crypto/algif_skcipher.c > @@ -43,12 +43,21 @@ static int skcipher_sendmsg(struct socket *sock, struct > msghdr *msg, > { > struct sock *sk = sock->sk; > struct alg_sock *ask = alg_sk(sk); > + struct af_alg_ctx *ctx = ask->private; > struct sock *psk = ask->parent; > struct alg_sock *pask = alg_sk(psk); > - struct crypto_skcipher *tfm = pask->private; > - unsigned ivsize = crypto_skcipher_ivsize(tfm); > + struct crypto_skcipher *skc = pask->private; > + struct crypto_tfm *tfm = crypto_skcipher_tfm(skc); > + unsigned int ivsize = crypto_skcipher_ivsize(skc); > + int ret = af_alg_sendmsg(sock, msg, size, ivsize); > + > + if (ret < 0) > + return ret; > > - return af_alg_sendmsg(sock, msg, size, ivsize); > + if (ctx->iiv == ALG_IV_SERIAL_PROCESSING) > + tfm->crt_flags |= CRYPTO_TFM_REQ_IV_SERIALIZE; > + > + return ret; > } > > static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg, > diff --git a/include/linux/crypto.h b/include/linux/crypto.h > index 4860aa2c9be4..4d54f2b30692 100644 > --- a/include/linux/crypto.h > +++ b/include/linux/crypto.h > @@ -133,6 +133,7 @@ > #define CRYPTO_TFM_REQ_WEAK_KEY 0x0100 > #define CRYPTO_TFM_REQ_MAY_SLEEP 0x0200 > #define CRYPTO_TFM_REQ_MAY_BACKLOG 0x0400 > +#define CRYPTO_TFM_REQ_IV_SERIALIZE 0x0800 > #define CRYPTO_TFM_RES_WEAK_KEY 0x0010 > #define CRYPTO_TFM_RES_BAD_KEY_LEN 0x0020 > #define CRYPTO_TFM_RES_BAD_KEY_SCHED 0x0040
Re: [PATCH v3 1/4] crypto: AF_ALG AIO - lock context IV
On 10-02-2018 03:33, Stephan Müller wrote: > The kernel crypto API requires the caller to set an IV in the request data > structure. That request data structure shall define one particular cipher > operation. During the cipher operation, the IV is read by the cipher > implementation and eventually the potentially updated IV (e.g. in case of > CBC) is written back to the memory location the request data structure > points to. > > AF_ALG allows setting the IV with a sendmsg request, where the IV is stored > in the AF_ALG context that is unique to one particular AF_ALG socket. Note > the analogy: an AF_ALG socket is like a TFM where one recvmsg operation > uses one request with the TFM from the socket. > > AF_ALG these days supports AIO operations with multiple IOCBs. I.e. with > one recvmsg call, multiple IOVECs can be specified. Each individual IOCB > (derived from one IOVEC) implies that one request data structure is > created with the data to be processed by the cipher implementation. The > IV that was set with the sendmsg call is registered with the request data > structure before the cipher operation. > > In case of an AIO operation, the cipher operation invocation returns > immediately, queuing the request to the hardware. While the AIO request is > processed by the hardware, recvmsg processes the next IOVEC for which > another request is created. Again, the IV buffer from the AF_ALG socket > context is registered with the new request and the cipher operation is > invoked. > > You may now see that there is a potential race condition regarding the IV > handling, because there is *no* separate IV buffer for the different > requests. 
This is nicely demonstrated with libkcapi using the following > command which creates an AIO request with two IOCBs each encrypting one > AES block in CBC mode: > > kcapi -d 2 -x 9 -e -c "cbc(aes)" -k > 8d7dd9b0170ce0b5f2f8e1aa768e01e91da8bfc67fd486d081b28254c99eb423 -i > 7fbc02ebf5b93322329df9bfccb635af -p 48981da18e4bb9ef7e2e3162d16b1910 > > When the first AIO request finishes before the 2nd AIO request is > processed, the returned value is: > > 8b19050f66582cb7f7e4b6c873819b7108afa0eaa7de29bac7d903576b674c32 > > I.e. two blocks where the IV output from the first request is the IV input > to the 2nd block. > > In case the first AIO request is not completed before the 2nd request > commences, the result is two identical AES blocks (i.e. both use the same > IV): > > 8b19050f66582cb7f7e4b6c873819b718b19050f66582cb7f7e4b6c873819b71 > > This inconsistent result may even lead to the conclusion that there can be > a memory corruption in the IV buffer if both AIO requests write to the IV > buffer at the same time. > > As the AF_ALG interface is used by user space, a mutex provides the > serialization which puts the caller to sleep in case a previous IOCB > processing is not yet finished. > > If multiple IOCBs arrive that all are blocked, the mutex' FIFO operation > of processing arriving requests ensures that the blocked IOCBs are > unblocked in the right order. Hi Stephan, The patch set is working fine with the Chelsio driver. Do we really need the IV locking mechanism for AEAD algorithms? AEAD algorithms don't support partial-mode operation, and drivers (at least Chelsio's) do not update IVs on AEAD request completions. 
> > CC: #4.14 > Fixes: e870456d8e7c8 ("crypto: algif_skcipher - overhaul memory management") > Fixes: d887c52d6ae43 ("crypto: algif_aead - overhaul memory management") > Signed-off-by: Stephan Mueller > --- > crypto/af_alg.c | 31 +++ > crypto/algif_aead.c | 20 +--- > crypto/algif_skcipher.c | 12 +--- > include/crypto/if_alg.h | 5 + > 4 files changed, 58 insertions(+), 10 deletions(-) > > diff --git a/crypto/af_alg.c b/crypto/af_alg.c > index 5231f421ad00..e7887621aa44 100644 > --- a/crypto/af_alg.c > +++ b/crypto/af_alg.c > @@ -1051,6 +1051,8 @@ void af_alg_async_cb(struct crypto_async_request *_req, > int err) > struct kiocb *iocb = areq->iocb; > unsigned int resultlen; > > + af_alg_put_iv(sk); > + > /* Buffer size written by crypto operation. */ > resultlen = areq->outlen; > > @@ -1175,6 +1177,35 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr > *msg, int flags, > } > EXPORT_SYMBOL_GPL(af_alg_get_rsgl); > > +/** > + * af_alg_get_iv > + * > + * @sk [in] AF_ALG socket > + * @return 0 on success, < 0 on error > + */ > +int af_alg_get_iv(struct sock *sk) > +{ > + struct alg_sock *ask = alg_sk(sk); > + struct af_alg_ctx *ctx = ask->private; > + > + return mutex_lock_interruptible(&ctx->ivlock); > +} > +EXPORT_SYMBOL_GPL(af_alg_get_iv); > + > +/** > + * af_alg_put_iv - release lock on IV in case CTX IV is used > + * > + * @sk [in] AF_ALG socket > + */ > +void af_alg_put_iv(struct sock *sk) > +{ > + struct alg_sock *ask = alg_sk(sk); > + struct af_alg_ctx *ctx = ask->private; > + > + mutex_unlock(&ctx->ivlock); > +} > +EXPORT_SYMBOL_GPL(
Re: [PATCH 14/14] x86/crypto: aesni: Update aesni-intel_glue to use scatter/gather
[Resending after delivery failure] Hi Dave, On 02/13/2018 10:22 AM, Dave Watson wrote: > > Yes, these both sound reasonable. I will send a V2. > > Thanks! Another minor suggestion for v2: It might be a good idea to check if the first assoclen bytes are already contiguous and only do the kmalloc if that isn't the case. Thanks, Junaid
Re: [PATCH v2 3/5] crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
On 13 February 2018 at 18:57, Eric Biggers wrote: > Hi Ard, > > On Tue, Feb 13, 2018 at 11:34:36AM +, Ard Biesheuvel wrote: >> Hi Eric, >> >> On 12 February 2018 at 23:52, Eric Biggers wrote: >> > Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on >> > 128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for >> > Speck64. Each 128-byte chunk goes through XTS preprocessing, then is >> > encrypted/decrypted (doing one cipher round for all the blocks, then the >> > next round, etc.), then goes through XTS postprocessing. >> > >> > The performance depends on the processor but can be about 3 times faster >> > than the generic code. For example, on an ARMv7 processor we observe >> > the following performance with Speck128/256-XTS: >> > >> > xts-speck128-neon: Encryption 107.9 MB/s, Decryption 108.1 MB/s >> > xts(speck128-generic): Encryption 32.1 MB/s, Decryption 36.6 MB/s >> > >> > In comparison to AES-256-XTS without the Cryptography Extensions: >> > >> > xts-aes-neonbs:Encryption 41.2 MB/s, Decryption 36.7 MB/s >> > xts(aes-asm): Encryption 31.7 MB/s, Decryption 30.8 MB/s >> > xts(aes-generic): Encryption 21.2 MB/s, Decryption 20.9 MB/s >> > >> > Speck64/128-XTS is even faster: >> > >> > xts-speck64-neon: Encryption 138.6 MB/s, Decryption 139.1 MB/s >> > >> > Note that as with the generic code, only the Speck128 and Speck64 >> > variants are supported. Also, for now only the XTS mode of operation is >> > supported, to target the disk and file encryption use cases. The NEON >> > code also only handles the portion of the data that is evenly divisible >> > into 128-byte chunks, with any remainder handled by a C fallback. Of >> > course, other modes of operation could be added later if needed, and/or >> > the NEON code could be updated to handle other buffer sizes. 
>> > >> > The XTS specification is only defined for AES which has a 128-bit block >> > size, so for the GF(2^64) math needed for Speck64-XTS we use the >> > reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX >> > paper. Of course, when possible users should use Speck128-XTS, but even >> > that may be too slow on some processors; Speck64-XTS can be faster. >> > >> >> I think this is excellent work. Speck seems an appropriate solution to >> this problem, and I'm glad we are not ending up with a stream cipher >> for block encryption. >> >> Also, I think an arm64 port would be nice. I may take a stab at this >> if nobody else beats me to it. > > We don't really want to encourage people to use Speck over AES with the > Cryptography Extensions, so that's why I didn't include an arm64 port. That > being said, I suppose we can't stop people from adding an arm64 port if they > really do prefer Speck, or maybe for use on arm64 CPUs that don't have the > Cryptography Extensions (though I thought that almost all do). > Many do, but not all of them. A notable exception is the Raspberry Pi 3. >> >> I did run into an issue with this code though: On big-endian, I get >> >> [ 0.272381] alg: skcipher: Test 1 failed (invalid result) on >> encryption for xts-speck64-neon >> [0.276151] : 84 af 54 07 19 d4 7c a6 9c 8a ac f6 c2 14 04 d8 >> [0.278541] 0010: 7f 18 6c 43 56 ed 0b b3 92 21 a2 d9 17 59 e4 3b >> >> so there may be a byte order corner case you missed in the rewrite (or >> the issue existed before, as I did not test your v1) >> > > To be honest I haven't tested either version on a big endian ARM CPU yet. I > don't really know how to do that currently; maybe it's possible with QEMU. > I tested this on a big-endian 32-bit VM running under KVM on a 64-bit host. > But assuming I haven't missed anything, in the assembly code everything is > treated as byte arrays with the exception of the round keys which are 32-bit > or > 64-bit numbers in CPU endianness. 
The byte arrays are loaded and stored with > vld1.8 and vst1.8 while the round keys are loaded with vld1.32 or vld1.64, so > the assembly code *should* work correctly on a big endian CPU. > Indeed. > However, looking over it now, I think there is a bug in the glue code for > Speck64-XTS when it handles buffers not evenly divisible into 128 bytes. > Namely, the tweak is treated as CPU endian when it should be little endian. > Could you try the following patch? > > diff --git a/arch/arm/crypto/speck-neon-glue.c > b/arch/arm/crypto/speck-neon-glue.c > index 3987dd6e063e..960cc634b36f 100644 > --- a/arch/arm/crypto/speck-neon-glue.c > +++ b/arch/arm/crypto/speck-neon-glue.c > @@ -157,7 +157,7 @@ __speck64_xts_crypt(struct skcipher_request *req, > speck64_crypt_one_t crypt_one, > struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); > const struct speck64_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm); > struct skcipher_walk walk; > - u64 tweak; > + __le64 tweak; > int err; > > err = skcipher_walk_virt(&walk,
Re: [PATCH v2 3/5] crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
Hi Ard, On Tue, Feb 13, 2018 at 11:34:36AM +, Ard Biesheuvel wrote: > Hi Eric, > > On 12 February 2018 at 23:52, Eric Biggers wrote: > > Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on > > 128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for > > Speck64. Each 128-byte chunk goes through XTS preprocessing, then is > > encrypted/decrypted (doing one cipher round for all the blocks, then the > > next round, etc.), then goes through XTS postprocessing. > > > > The performance depends on the processor but can be about 3 times faster > > than the generic code. For example, on an ARMv7 processor we observe > > the following performance with Speck128/256-XTS: > > > > xts-speck128-neon: Encryption 107.9 MB/s, Decryption 108.1 MB/s > > xts(speck128-generic): Encryption 32.1 MB/s, Decryption 36.6 MB/s > > > > In comparison to AES-256-XTS without the Cryptography Extensions: > > > > xts-aes-neonbs:Encryption 41.2 MB/s, Decryption 36.7 MB/s > > xts(aes-asm): Encryption 31.7 MB/s, Decryption 30.8 MB/s > > xts(aes-generic): Encryption 21.2 MB/s, Decryption 20.9 MB/s > > > > Speck64/128-XTS is even faster: > > > > xts-speck64-neon: Encryption 138.6 MB/s, Decryption 139.1 MB/s > > > > Note that as with the generic code, only the Speck128 and Speck64 > > variants are supported. Also, for now only the XTS mode of operation is > > supported, to target the disk and file encryption use cases. The NEON > > code also only handles the portion of the data that is evenly divisible > > into 128-byte chunks, with any remainder handled by a C fallback. Of > > course, other modes of operation could be added later if needed, and/or > > the NEON code could be updated to handle other buffer sizes. > > > > The XTS specification is only defined for AES which has a 128-bit block > > size, so for the GF(2^64) math needed for Speck64-XTS we use the > > reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX > > paper. 
Of course, when possible users should use Speck128-XTS, but even > > that may be too slow on some processors; Speck64-XTS can be faster. > > > > I think this is excellent work. Speck seems an appropriate solution to > this problem, and I'm glad we are not ending up with a stream cipher > for block encryption. > > Also, I think an arm64 port would be nice. I may take a stab at this > if nobody else beats me to it. We don't really want to encourage people to use Speck over AES with the Cryptography Extensions, so that's why I didn't include an arm64 port. That being said, I suppose we can't stop people from adding an arm64 port if they really do prefer Speck, or maybe for use on arm64 CPUs that don't have the Cryptography Extensions (though I thought that almost all do). > > I did run into an issue with this code though: On big-endian, I get > > [0.272381] alg: skcipher: Test 1 failed (invalid result) on > encryption for xts-speck64-neon > [0.276151] : 84 af 54 07 19 d4 7c a6 9c 8a ac f6 c2 14 04 d8 > [0.278541] 0010: 7f 18 6c 43 56 ed 0b b3 92 21 a2 d9 17 59 e4 3b > > so there may be a byte order corner case you missed in the rewrite (or > the issue existed before, as I did not test your v1) > To be honest I haven't tested either version on a big endian ARM CPU yet. I don't really know how to do that currently; maybe it's possible with QEMU. But assuming I haven't missed anything, in the assembly code everything is treated as byte arrays with the exception of the round keys which are 32-bit or 64-bit numbers in CPU endianness. The byte arrays are loaded and stored with vld1.8 and vst1.8 while the round keys are loaded with vld1.32 or vld1.64, so the assembly code *should* work correctly on a big endian CPU. However, looking over it now, I think there is a bug in the glue code for Speck64-XTS when it handles buffers not evenly divisible into 128 bytes. Namely, the tweak is treated as CPU endian when it should be little endian. Could you try the following patch? 
diff --git a/arch/arm/crypto/speck-neon-glue.c b/arch/arm/crypto/speck-neon-glue.c index 3987dd6e063e..960cc634b36f 100644 --- a/arch/arm/crypto/speck-neon-glue.c +++ b/arch/arm/crypto/speck-neon-glue.c @@ -157,7 +157,7 @@ __speck64_xts_crypt(struct skcipher_request *req, speck64_crypt_one_t crypt_one, struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); const struct speck64_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm); struct skcipher_walk walk; - u64 tweak; + __le64 tweak; int err; err = skcipher_walk_virt(&walk, req, true); @@ -184,16 +184,16 @@ __speck64_xts_crypt(struct skcipher_request *req, speck64_crypt_one_t crypt_one, } /* Handle any remainder with generic code */ - while (nbytes >= sizeof(u64)) { - *(u64 *)dst = *(u64 *)src ^ tweak; + while (nbytes >
Re: [PATCH 14/14] x86/crypto: aesni: Update aesni-intel_glue to use scatter/gather
On 02/13/18 08:42 AM, Stephan Mueller wrote: > > +static int gcmaes_encrypt_sg(struct aead_request *req, unsigned int > > assoclen, + u8 *hash_subkey, u8 *iv, void *aes_ctx) > > +{ > > + struct crypto_aead *tfm = crypto_aead_reqtfm(req); > > + unsigned long auth_tag_len = crypto_aead_authsize(tfm); > > + struct gcm_context_data data AESNI_ALIGN_ATTR; > > + struct scatter_walk dst_sg_walk = {}; > > + unsigned long left = req->cryptlen; > > + unsigned long len, srclen, dstlen; > > + struct scatter_walk src_sg_walk; > > + struct scatterlist src_start[2]; > > + struct scatterlist dst_start[2]; > > + struct scatterlist *src_sg; > > + struct scatterlist *dst_sg; > > + u8 *src, *dst, *assoc; > > + u8 authTag[16]; > > + > > + assoc = kmalloc(assoclen, GFP_ATOMIC); > > + if (unlikely(!assoc)) > > + return -ENOMEM; > > + scatterwalk_map_and_copy(assoc, req->src, 0, assoclen, 0); > > Have you tested that this code does not barf when assoclen is 0? > > Maybe it is worth while to finally add a test vector to testmgr.h which > validates such scenario. If you would like, here is a vector you could add to > testmgr: > > https://github.com/smuellerDD/libkcapi/blob/master/test/test.sh#L315 I tested assoclen and cryptlen being 0 and it works, yes. Both kmalloc and scatterwalk_map_and_copy work correctly with 0 assoclen. > This is a decryption of gcm(aes) with no message, no AAD and just a tag. The > result should be EBADMSG. > > + > > + src_sg = scatterwalk_ffwd(src_start, req->src, req->assoclen); > > Why do you use assoclen in the map_and_copy, and req->assoclen in the ffwd? If I understand correctly, rfc4106 appends extra data after the assoc. assoclen is the real assoc length, req->assoclen is assoclen + the extra data length. So we ffwd by req->assoclen in the scatterlist, but use assoclen when doing the memcpy and the checks. 
> > > > +static int gcmaes_decrypt_sg(struct aead_request *req, unsigned int > > assoclen, + u8 *hash_subkey, u8 *iv, void *aes_ctx) > > +{ > > This is a lot of code duplication. I will merge them and send a V2. > Ciao > Stephan > > Thanks!
Re: [PATCH 14/14] x86/crypto: aesni: Update aesni-intel_glue to use scatter/gather
On 02/12/18 03:12 PM, Junaid Shahid wrote: > Hi Dave, > > > On 02/12/2018 11:51 AM, Dave Watson wrote: > > > +static int gcmaes_encrypt_sg(struct aead_request *req, unsigned int > > assoclen, > > + u8 *hash_subkey, u8 *iv, void *aes_ctx) > > > > +static int gcmaes_decrypt_sg(struct aead_request *req, unsigned int > > assoclen, > > + u8 *hash_subkey, u8 *iv, void *aes_ctx) > > These two functions are almost identical. Wouldn't it be better to combine > them into a single encrypt/decrypt function, similar to what you have done > for the assembly macros? > > > + if (((struct crypto_aes_ctx *)aes_ctx)->key_length != AES_KEYSIZE_128 || > > + aesni_gcm_enc_tfm == aesni_gcm_enc) { > > Shouldn't we also include a check for the buffer length being less than > AVX_GEN2_OPTSIZE? AVX will not be used in that case either. Yes, these both sound reasonable. I will send a V2. Thanks!
RE: [Crypto v4 03/12] support for inline tls
-Original Message- From: David Miller [mailto:da...@davemloft.net] Sent: Tuesday, February 13, 2018 1:19 AM To: Atul Gupta Cc: davejwat...@fb.com; herb...@gondor.apana.org.au; s...@queasysnail.net; linux-crypto@vger.kernel.org; net...@vger.kernel.org; Ganesh GR Subject: Re: [Crypto v4 03/12] support for inline tls From: Atul Gupta Date: Mon, 12 Feb 2018 17:34:28 +0530 > +static int get_tls_prot(struct sock *sk) { > + struct tls_context *ctx = tls_get_ctx(sk); > + struct net_device *netdev; > + struct tls_device *dev; > + > + /* Device bound to specific IP */ > + if (inet_sk(sk)->inet_rcv_saddr) { > + netdev = find_netdev(sk); > + if (!netdev) > + goto out; > + > + /* Device supports Inline record processing */ > + if (!(netdev->features & NETIF_F_HW_TLS_INLINE)) > + goto out; > + > + mutex_lock(&device_mutex); > + list_for_each_entry(dev, &device_list, dev_list) { > + if (dev->netdev && dev->netdev(dev, netdev)) > + break; > + } > + mutex_unlock(&device_mutex); > + > + ctx->tx_conf = TLS_FULL_HW; > + if (dev->prot) > + dev->prot(dev, sk); What if the same IP address is configured on multiple interfaces? Thanks, I overlooked this point. The checks above were based on the premise that the chosen device is indeed the one with Inline TLS enabled: the net_device corresponds to the specific IP address, the feature is configured for the device via ethtool, and the net_device matches an Inline TLS driver registered with net/tls. The case of the same IP configured on multiple interfaces looks similar to the INADDR_ANY case below. TLS_FULL_HW and the modified hash routines handle devices with and without Inline TLS support. The first Inline TLS capable device updates sk_prot for TLS_FULL_HW. tls_hw_hash listens on all interfaces and runs the device-specific routine; the listen, however, may succeed on the device on which the connect was initiated, which may not have Inline TLS capability, and such a connection continues in TLS_BASE_TX (non-TLS-offload) mode. 
On the other hand, if an Inline TLS capable device establishes the connection, it updates the prot as required for offload mode to continue. > + } else { /* src address not known or INADDR_ANY */ > + mutex_lock(&device_mutex); > + list_for_each_entry(dev, &device_list, dev_list) { > + if (dev->feature && dev->feature(dev)) { > + ctx->tx_conf = TLS_FULL_HW; > + break; > + } > + } > + mutex_unlock(&device_mutex); > + update_sk_prot(sk, ctx); And I think this is even more of a stretch. Just because you find an inline TLS device on the global list doesn't mean traffic will necessarily flow through it once the connection is fully established and therefore be able to provide inline TLS offloading.
Re: [PATCH v2 3/5] crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
Hi Eric, On 12 February 2018 at 23:52, Eric Biggers wrote: > Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on > 128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for > Speck64. Each 128-byte chunk goes through XTS preprocessing, then is > encrypted/decrypted (doing one cipher round for all the blocks, then the > next round, etc.), then goes through XTS postprocessing. > > The performance depends on the processor but can be about 3 times faster > than the generic code. For example, on an ARMv7 processor we observe > the following performance with Speck128/256-XTS: > > xts-speck128-neon: Encryption 107.9 MB/s, Decryption 108.1 MB/s > xts(speck128-generic): Encryption 32.1 MB/s, Decryption 36.6 MB/s > > In comparison to AES-256-XTS without the Cryptography Extensions: > > xts-aes-neonbs:Encryption 41.2 MB/s, Decryption 36.7 MB/s > xts(aes-asm): Encryption 31.7 MB/s, Decryption 30.8 MB/s > xts(aes-generic): Encryption 21.2 MB/s, Decryption 20.9 MB/s > > Speck64/128-XTS is even faster: > > xts-speck64-neon: Encryption 138.6 MB/s, Decryption 139.1 MB/s > > Note that as with the generic code, only the Speck128 and Speck64 > variants are supported. Also, for now only the XTS mode of operation is > supported, to target the disk and file encryption use cases. The NEON > code also only handles the portion of the data that is evenly divisible > into 128-byte chunks, with any remainder handled by a C fallback. Of > course, other modes of operation could be added later if needed, and/or > the NEON code could be updated to handle other buffer sizes. > > The XTS specification is only defined for AES which has a 128-bit block > size, so for the GF(2^64) math needed for Speck64-XTS we use the > reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX > paper. Of course, when possible users should use Speck128-XTS, but even > that may be too slow on some processors; Speck64-XTS can be faster. > I think this is excellent work. 
Speck seems an appropriate solution to this problem, and I'm glad we are not ending up with a stream cipher for block encryption. Also, I think an arm64 port would be nice. I may take a stab at this if nobody else beats me to it. I did run into an issue with this code though: On big-endian, I get [0.272381] alg: skcipher: Test 1 failed (invalid result) on encryption for xts-speck64-neon [0.276151] : 84 af 54 07 19 d4 7c a6 9c 8a ac f6 c2 14 04 d8 [0.278541] 0010: 7f 18 6c 43 56 ed 0b b3 92 21 a2 d9 17 59 e4 3b so there may be a byte order corner case you missed in the rewrite (or the issue existed before, as I did not test your v1) -- Ard. > Signed-off-by: Eric Biggers > --- > arch/arm/crypto/Kconfig | 6 + > arch/arm/crypto/Makefile | 2 + > arch/arm/crypto/speck-neon-core.S | 432 > ++ > arch/arm/crypto/speck-neon-glue.c | 290 + > 4 files changed, 730 insertions(+) > create mode 100644 arch/arm/crypto/speck-neon-core.S > create mode 100644 arch/arm/crypto/speck-neon-glue.c > > diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig > index b8e69fe282b8..925d1364727a 100644 > --- a/arch/arm/crypto/Kconfig > +++ b/arch/arm/crypto/Kconfig > @@ -121,4 +121,10 @@ config CRYPTO_CHACHA20_NEON > select CRYPTO_BLKCIPHER > select CRYPTO_CHACHA20 > > +config CRYPTO_SPECK_NEON > + tristate "NEON accelerated Speck cipher algorithms" > + depends on KERNEL_MODE_NEON > + select CRYPTO_BLKCIPHER > + select CRYPTO_SPECK > + > endif > diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile > index 30ef8e291271..a758107c5525 100644 > --- a/arch/arm/crypto/Makefile > +++ b/arch/arm/crypto/Makefile > @@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o > obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o > obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o > obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o > +obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o > > ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o > ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += 
sha1-arm-ce.o > @@ -53,6 +54,7 @@ ghash-arm-ce-y:= ghash-ce-core.o ghash-ce-glue.o > crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o > crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o > chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o > +speck-neon-y := speck-neon-core.o speck-neon-glue.o > > quiet_cmd_perl = PERL$@ >cmd_perl = $(PERL) $(<) > $(@) > diff --git a/arch/arm/crypto/speck-neon-core.S > b/arch/arm/crypto/speck-neon-core.S > new file mode 100644 > index ..3c1e203e53b9 > --- /dev/null > +++ b/arch/arm/crypto/speck-neon-core.S > @@ -0,0 +1,432 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * NEON-accelerated implementation of Speck128-XTS and Speck64-XTS > + * > + * Copyrigh
Re: [PATCH] crypto: arm/aes-cipher - move S-box to .rodata section
On 12 February 2018 at 13:52, Jinbum Park wrote: > Move the AES inverse S-box to the .rodata section > where it is safe from abuse by speculation. > > Signed-off-by: Jinbum Park Acked-by: Ard Biesheuvel > --- > arch/arm/crypto/aes-cipher-core.S | 19 ++- > 1 file changed, 10 insertions(+), 9 deletions(-) > > diff --git a/arch/arm/crypto/aes-cipher-core.S > b/arch/arm/crypto/aes-cipher-core.S > index 54b3840..184d6c2 100644 > --- a/arch/arm/crypto/aes-cipher-core.S > +++ b/arch/arm/crypto/aes-cipher-core.S > @@ -174,6 +174,16 @@ > .ltorg > .endm > > +ENTRY(__aes_arm_encrypt) > + do_cryptfround, crypto_ft_tab, crypto_ft_tab + 1, 2 > +ENDPROC(__aes_arm_encrypt) > + > + .align 5 > +ENTRY(__aes_arm_decrypt) > + do_cryptiround, crypto_it_tab, __aes_arm_inverse_sbox, 0 > +ENDPROC(__aes_arm_decrypt) > + > + .section".rodata", "a" > .align L1_CACHE_SHIFT > .type __aes_arm_inverse_sbox, %object > __aes_arm_inverse_sbox: > @@ -210,12 +220,3 @@ __aes_arm_inverse_sbox: > .byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26 > .byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d > .size __aes_arm_inverse_sbox, . - __aes_arm_inverse_sbox > - > -ENTRY(__aes_arm_encrypt) > - do_cryptfround, crypto_ft_tab, crypto_ft_tab + 1, 2 > -ENDPROC(__aes_arm_encrypt) > - > - .align 5 > -ENTRY(__aes_arm_decrypt) > - do_cryptiround, crypto_it_tab, __aes_arm_inverse_sbox, 0 > -ENDPROC(__aes_arm_decrypt) > -- > 1.9.1 >
Re: [PATCH] crypto: Fix incorrect values in PKCS#1 test vector
On Tuesday, 13 February 2018, 09:29:56 CET, Mcloughlin, Conor wrote: Hi Conor, > The RSA private key for the first form should have > version, prime1, prime2, exponent1, exponent2, coefficient > values 0. > With non-zero values for prime1,2, exponent 1,2 and coefficient > the Intel QAT driver will assume that values are provided for the > private key second form. This will result in signature verification > failures for modules where QAT device is present and the modules > are signed with rsa,sha256. > > Cc: > Signed-off-by: Giovanni Cabiddu > Signed-off-by: Conor McLoughlin Reviewed-by: Stephan Mueller Ciao Stephan
[PATCH 3/8] crypto: inside-secure - fix the extra cache computation
This patch fixes the extra cache computation when the queued data is a multiple of a block size. This fixes the hash support in some cases. Fixes: 809778e02cd4 ("crypto: inside-secure - fix hash when length is a multiple of a block") Signed-off-by: Antoine Tenart --- drivers/crypto/inside-secure/safexcel_hash.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c index 122a2a58e98f..8fac23b78da5 100644 --- a/drivers/crypto/inside-secure/safexcel_hash.c +++ b/drivers/crypto/inside-secure/safexcel_hash.c @@ -198,7 +198,7 @@ static int safexcel_ahash_send_req(struct crypto_async_request *async, int ring, /* If this is not the last request and the queued data * is a multiple of a block, cache the last one for now. */ - extra = queued - crypto_ahash_blocksize(ahash); + extra = crypto_ahash_blocksize(ahash); if (extra) { sg_pcopy_to_buffer(areq->src, sg_nents(areq->src), -- 2.14.3
[PATCH 7/8] crypto: inside-secure - keep the requests push/pop synced
This patch updates the Inside Secure SafeXcel driver to avoid getting
out of sync between the number of requests sent and the number of
requests completed.

The number of requests acknowledged by the driver can differ from the
threshold that was configured if new requests were pushed to the
hardware in the meantime. The driver wasn't taking those into account,
so the number of remaining requests to handle (used to reconfigure the
interrupt threshold) could get out of sync.

This patch fixes it by not accounting for the number of requests left,
but for the total number of requests sent to the hardware, so that new
requests are taken into account.

Fixes: dc7e28a3286e ("crypto: inside-secure - dequeue all requests at once")
Suggested-by: Ofer Heifetz
Signed-off-by: Antoine Tenart
---
 drivers/crypto/inside-secure/safexcel.c | 28 +---
 drivers/crypto/inside-secure/safexcel.h |  6 ++
 2 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
index 0642d7181c9e..956a37692e42 100644
--- a/drivers/crypto/inside-secure/safexcel.c
+++ b/drivers/crypto/inside-secure/safexcel.c
@@ -432,20 +432,18 @@ static int safexcel_hw_init(struct safexcel_crypto_priv *priv)
 }

 /* Called with ring's lock taken */
-static int safexcel_try_push_requests(struct safexcel_crypto_priv *priv,
-				      int ring, int reqs)
+static void safexcel_try_push_requests(struct safexcel_crypto_priv *priv,
+				       int ring)
 {
-	int coal = min_t(int, reqs, EIP197_MAX_BATCH_SZ);
+	int coal = min_t(int, priv->ring[ring].requests, EIP197_MAX_BATCH_SZ);

 	if (!coal)
-		return 0;
+		return;

 	/* Configure when we want an interrupt */
 	writel(EIP197_HIA_RDR_THRESH_PKT_MODE |
 	       EIP197_HIA_RDR_THRESH_PROC_PKT(coal),
 	       EIP197_HIA_RDR(priv, ring) + EIP197_HIA_xDR_THRESH);
-
-	return coal;
 }

 void safexcel_dequeue(struct safexcel_crypto_priv *priv, int ring)
@@ -521,13 +519,13 @@ void safexcel_dequeue(struct safexcel_crypto_priv *priv, int ring)

 	spin_lock_bh(&priv->ring[ring].egress_lock);

+	priv->ring[ring].requests += nreq;
+
 	if (!priv->ring[ring].busy) {
-		nreq -= safexcel_try_push_requests(priv, ring, nreq);
+		safexcel_try_push_requests(priv, ring);
 		priv->ring[ring].busy = true;
 	}

-	priv->ring[ring].requests_left += nreq;
-
 	spin_unlock_bh(&priv->ring[ring].egress_lock);

 	/* let the RDR know we have pending descriptors */
@@ -631,7 +629,7 @@ static inline void safexcel_handle_result_descriptor(struct safexcel_crypto_priv
 {
 	struct safexcel_request *sreq;
 	struct safexcel_context *ctx;
-	int ret, i, nreq, ndesc, tot_descs, done;
+	int ret, i, nreq, ndesc, tot_descs, handled = 0;
 	bool should_complete;

 handle_results:
@@ -667,6 +665,7 @@ static inline void safexcel_handle_result_descriptor(struct safexcel_crypto_priv
 		kfree(sreq);

 		tot_descs += ndesc;
+		handled++;
 	}

 acknowledge:
@@ -685,11 +684,10 @@ static inline void safexcel_handle_result_descriptor(struct safexcel_crypto_priv
 requests_left:
 	spin_lock_bh(&priv->ring[ring].egress_lock);

-	done = safexcel_try_push_requests(priv, ring,
-					  priv->ring[ring].requests_left);
+	priv->ring[ring].requests -= handled;
+	safexcel_try_push_requests(priv, ring);

-	priv->ring[ring].requests_left -= done;
-	if (!done && !priv->ring[ring].requests_left)
+	if (!priv->ring[ring].requests)
 		priv->ring[ring].busy = false;

 	spin_unlock_bh(&priv->ring[ring].egress_lock);
@@ -970,7 +968,7 @@ static int safexcel_probe(struct platform_device *pdev)
 			goto err_clk;
 		}

-		priv->ring[i].requests_left = 0;
+		priv->ring[i].requests = 0;
 		priv->ring[i].busy = false;

 		crypto_init_queue(&priv->ring[i].queue,
diff --git a/drivers/crypto/inside-secure/safexcel.h b/drivers/crypto/inside-secure/safexcel.h
index 4e219c21608b..caaf6a81b162 100644
--- a/drivers/crypto/inside-secure/safexcel.h
+++ b/drivers/crypto/inside-secure/safexcel.h
@@ -551,10 +551,8 @@ struct safexcel_crypto_priv {
 		struct crypto_queue queue;
 		spinlock_t queue_lock;

-		/* Number of requests in the engine that needs the threshold
-		 * interrupt to be set up.
-		 */
-		int requests_left;
+		/* Number of requests in the engine. */
+		int requests;

 		/* The ring is currently handling at least one request */
 		bool busy;
--
2.14.3
[PATCH] crypto: Fix incorrect values in PKCS#1 test vector
The RSA private key for the first form should have version, prime1,
prime2, exponent1, exponent2 and coefficient values of 0.
With non-zero values for prime1, prime2, exponent1, exponent2 and
coefficient, the Intel QAT driver will assume that values are provided
for the private key second form. This will result in signature
verification failures for modules where a QAT device is present and the
modules are signed with rsa,sha256.

Cc:
Signed-off-by: Giovanni Cabiddu
Signed-off-by: Conor McLoughlin
---
 crypto/testmgr.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 6044f69..69fb51e 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -548,7 +548,7 @@ struct kpp_testvec {
 static const struct akcipher_testvec pkcs1pad_rsa_tv_template[] = {
 	{
 	.key =
-	"\x30\x82\x03\x1f\x02\x01\x10\x02\x82\x01\x01\x00\xd7\x1e\x77\x82"
+	"\x30\x82\x03\x1f\x02\x01\x00\x02\x82\x01\x01\x00\xd7\x1e\x77\x82"
 	"\x8c\x92\x31\xe7\x69\x02\xa2\xd5\x5c\x78\xde\xa2\x0c\x8f\xfe\x28"
 	"\x59\x31\xdf\x40\x9c\x60\x61\x06\xb9\x2f\x62\x40\x80\x76\xcb\x67"
 	"\x4a\xb5\x59\x56\x69\x17\x07\xfa\xf9\x4c\xbd\x6c\x37\x7a\x46\x7d"
@@ -597,8 +597,8 @@ struct kpp_testvec {
 	"\xfe\xf8\x27\x1b\xd6\x55\x60\x5e\x48\xb7\x6d\x9a\xa8\x37\xf9\x7a"
 	"\xde\x1b\xcd\x5d\x1a\x30\xd4\xe9\x9e\x5b\x3c\x15\xf8\x9c\x1f\xda"
 	"\xd1\x86\x48\x55\xce\x83\xee\x8e\x51\xc7\xde\x32\x12\x47\x7d\x46"
-	"\xb8\x35\xdf\x41\x02\x01\x30\x02\x01\x30\x02\x01\x30\x02\x01\x30"
-	"\x02\x01\x30",
+	"\xb8\x35\xdf\x41\x02\x01\x00\x02\x01\x00\x02\x01\x00\x02\x01\x00"
+	"\x02\x01\x00",
 	.key_len = 804,
 	/*
 	 * m is SHA256 hash of following message:
--
1.9.1
[PATCH 6/8] crypto: inside-secure - fix the invalidation step during cra_exit
When exiting a transformation, the cra_exit() helper is called for each
driver providing one. The Inside Secure SafeXcel driver has one, which
is responsible for freeing some areas and for sending an invalidation
request to the crypto engine, to invalidate the context that was used
during the transformation.

In some setups (when lots of transformations with a short lifetime were
being used, and hence lots of cra_exit() calls were made) we could see
NULL pointer dereferences and other weird issues. All these issues came
from accessing the tfm context.

The problem is that the completion of the invalidation request is
checked with wait_for_completion_interruptible() in both the cipher and
hash cra_exit() helpers. In some cases this wait was interrupted before
the invalidation request had been processed. cra_exit() then returned,
and its caller freed the tfm instance. Only after that was the request
handled by the SafeXcel driver, which led to the said issues.

This patch fixes this by using wait_for_completion() calls in these
specific cases.
Fixes: 1b44c5a60c13 ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Signed-off-by: Antoine Tenart
---
 drivers/crypto/inside-secure/safexcel_cipher.c | 2 +-
 drivers/crypto/inside-secure/safexcel_hash.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index 63a8768ed2ae..17a7725a6f6d 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -456,7 +456,7 @@ static int safexcel_cipher_exit_inv(struct crypto_tfm *tfm)
 	queue_work(priv->ring[ring].workqueue,
 		   &priv->ring[ring].work_data.work);

-	wait_for_completion_interruptible(&result.completion);
+	wait_for_completion(&result.completion);

 	if (result.error) {
 		dev_warn(priv->dev,
diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c
index 2951101e1831..43e94cd59c86 100644
--- a/drivers/crypto/inside-secure/safexcel_hash.c
+++ b/drivers/crypto/inside-secure/safexcel_hash.c
@@ -493,7 +493,7 @@ static int safexcel_ahash_exit_inv(struct crypto_tfm *tfm)
 	queue_work(priv->ring[ring].workqueue,
 		   &priv->ring[ring].work_data.work);

-	wait_for_completion_interruptible(&result.completion);
+	wait_for_completion(&result.completion);

 	if (result.error) {
 		dev_warn(priv->dev, "hash: completion error (%d)\n",
--
2.14.3
[PATCH 0/8] crypto: inside-secure - stabilization and fixes
Hi Herbert,

We spent a lot of time with Ofer testing various use cases of the
Inside Secure driver. We performed many tests in addition to the crypto
subsystem ones (IPsec, openssl speed, tcrypt...). As a result, with
this series applied, the driver is a lot more stable in various
configurations.

The fixes in this series aren't all related, but were all found during
this stabilization process and testing.

Also, as the company I'm working for changed its name from Free
Electrons to Bootlin, the first patch updates my email address in the
MAINTAINERS file.

The series is based on v4.16-rc1.

Thanks!
Antoine

Antoine Tenart (8):
  MAINTAINERS: update the Inside Secure maintainer email
  crypto: inside-secure - do not overwrite the threshold value
  crypto: inside-secure - fix the extra cache computation
  crypto: inside-secure - fix the cache_len computation
  crypto: inside-secure - do not process request if no command was issued
  crypto: inside-secure - fix the invalidation step during cra_exit
  crypto: inside-secure - keep the requests push/pop synced
  crypto: inside-secure - unmap the result in the hash send error path

 MAINTAINERS                                    |  2 +-
 drivers/crypto/inside-secure/safexcel.c        | 40 +++---
 drivers/crypto/inside-secure/safexcel.h        |  6 ++--
 drivers/crypto/inside-secure/safexcel_cipher.c |  2 +-
 drivers/crypto/inside-secure/safexcel_hash.c   | 10 ---
 5 files changed, 33 insertions(+), 27 deletions(-)

--
2.14.3
[PATCH 2/8] crypto: inside-secure - do not overwrite the threshold value
This patch fixes the Inside Secure SafeXcel driver not to overwrite the
interrupt threshold value. In certain cases the value of this register,
which controls when to fire an interrupt, was overwritten. This led to
packets not being processed or acked, as the driver was never aware of
their completion.

This patch fixes this behaviour by not setting the threshold when
requests are being processed by the engine.

Fixes: dc7e28a3286e ("crypto: inside-secure - dequeue all requests at once")
Suggested-by: Ofer Heifetz
Signed-off-by: Antoine Tenart
---
 drivers/crypto/inside-secure/safexcel.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
index 225e74a7f724..5cb90bcd3f18 100644
--- a/drivers/crypto/inside-secure/safexcel.c
+++ b/drivers/crypto/inside-secure/safexcel.c
@@ -514,8 +514,7 @@ void safexcel_dequeue(struct safexcel_crypto_priv *priv, int ring)

 	if (!priv->ring[ring].busy) {
 		nreq -= safexcel_try_push_requests(priv, ring, nreq);
-		if (nreq)
-			priv->ring[ring].busy = true;
+		priv->ring[ring].busy = true;
 	}

 	priv->ring[ring].requests_left += nreq;
--
2.14.3
[PATCH 8/8] crypto: inside-secure - unmap the result in the hash send error path
This patch adds a label to unmap the result buffer in the hash send
function error path.

Fixes: 1b44c5a60c13 ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Suggested-by: Ofer Heifetz
Signed-off-by: Antoine Tenart
---
 drivers/crypto/inside-secure/safexcel_hash.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c
index 43e94cd59c86..a4960a934eef 100644
--- a/drivers/crypto/inside-secure/safexcel_hash.c
+++ b/drivers/crypto/inside-secure/safexcel_hash.c
@@ -303,7 +303,7 @@ static int safexcel_ahash_send_req(struct crypto_async_request *async, int ring,
 			   req->state_sz);
 	if (IS_ERR(rdesc)) {
 		ret = PTR_ERR(rdesc);
-		goto cdesc_rollback;
+		goto unmap_result;
 	}

 	spin_unlock_bh(&priv->ring[ring].egress_lock);
@@ -315,6 +315,8 @@ static int safexcel_ahash_send_req(struct crypto_async_request *async, int ring,
 	*results = 1;
 	return 0;

+unmap_result:
+	dma_unmap_sg(priv->dev, areq->src, req->nents, DMA_TO_DEVICE);
 cdesc_rollback:
 	for (i = 0; i < n_cdesc; i++)
 		safexcel_ring_rollback_wptr(priv, &priv->ring[ring].cdr);
--
2.14.3
[PATCH 4/8] crypto: inside-secure - fix the cache_len computation
This patch fixes the cache length computation, as cache_len could end
up being a negative value. The check between the queued size and the
block size is updated to reflect the caching mechanism, which can cache
up to a full block size (included!).

Fixes: 809778e02cd4 ("crypto: inside-secure - fix hash when length is a multiple of a block")
Signed-off-by: Antoine Tenart
---
 drivers/crypto/inside-secure/safexcel_hash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c
index 8fac23b78da5..2951101e1831 100644
--- a/drivers/crypto/inside-secure/safexcel_hash.c
+++ b/drivers/crypto/inside-secure/safexcel_hash.c
@@ -184,7 +184,7 @@ static int safexcel_ahash_send_req(struct crypto_async_request *async, int ring,
 	int i, queued, len, cache_len, extra, n_cdesc = 0, ret = 0;

 	queued = len = req->len - req->processed;
-	if (queued < crypto_ahash_blocksize(ahash))
+	if (queued <= crypto_ahash_blocksize(ahash))
 		cache_len = queued;
 	else
 		cache_len = queued - areq->nbytes;
--
2.14.3
[PATCH 1/8] MAINTAINERS: update the Inside Secure maintainer email
Free Electrons became Bootlin. Update my email accordingly.

Signed-off-by: Antoine Tenart
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bdc260e36b7..d0f6811b9eed 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6990,7 +6990,7 @@ F:	drivers/input/input-mt.c
 K:	\b(ABS|SYN)_MT_

 INSIDE SECURE CRYPTO DRIVER
-M:	Antoine Tenart
+M:	Antoine Tenart
 F:	drivers/crypto/inside-secure/
 S:	Maintained
 L:	linux-crypto@vger.kernel.org
--
2.14.3
[PATCH 5/8] crypto: inside-secure - do not process request if no command was issued
This patch adds a check in the SafeXcel dequeue function, to avoid
processing the request further if no hardware command was issued. This
can happen in certain cases where the ->send() function caches all the
data that would otherwise have been sent.

Fixes: 809778e02cd4 ("crypto: inside-secure - fix hash when length is a multiple of a block")
Signed-off-by: Antoine Tenart
---
 drivers/crypto/inside-secure/safexcel.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
index 5cb90bcd3f18..0642d7181c9e 100644
--- a/drivers/crypto/inside-secure/safexcel.c
+++ b/drivers/crypto/inside-secure/safexcel.c
@@ -490,6 +490,15 @@ void safexcel_dequeue(struct safexcel_crypto_priv *priv, int ring)
 		if (backlog)
 			backlog->complete(backlog, -EINPROGRESS);

+		/* In case the send() helper did not issue any command to push
+		 * to the engine because the input data was cached, continue to
+		 * dequeue other requests as this is valid and not an error.
+		 */
+		if (!commands && !results) {
+			kfree(request);
+			continue;
+		}
+
 		spin_lock_bh(&priv->ring[ring].egress_lock);
 		list_add_tail(&request->list, &priv->ring[ring].list);
 		spin_unlock_bh(&priv->ring[ring].egress_lock);
--
2.14.3