Re: [PATCH V3 2/6] crypto/nx: Create nx842_configure_crb function

2017-08-02 Thread Herbert Xu
On Fri, Jul 21, 2017 at 09:57:39PM -0700, Haren Myneni wrote:
> 
> CRB configuration is moved to nx842_configure_crb() so that it can
> be used for the icswx and VAS exec functions. The VAS function will
> be added later with P9 support.
> 
> Signed-off-by: Haren Myneni 

Your patch does not apply against cryptodev.  Please rebase.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RESEND,PATCH v4 3/3] crypto : stm32 - Add STM32F4 CRC32 support

2017-08-02 Thread Herbert Xu
On Mon, Jul 17, 2017 at 11:27:36AM +0300, Cosar Dindar wrote:
> This patch adds CRC (CRC32 Crypto) support for the STM32F4 series.
> 
> As a hardware limitation, polynomial and key setting are not supported.
> They are fixed as 0x4C11DB7 (poly) and 0x (key).
> The CRC32C (Castagnoli) algorithm is not used.
> 
> Signed-off-by: Cosar Dindar 
> Reviewed-by: Fabien Dessenne 

This patch doesn't apply anymore.  Please rebase.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH V7 5/7] crypto: AES CBC multi-buffer glue code

2017-08-02 Thread Herbert Xu
On Tue, Jul 25, 2017 at 07:09:58PM -0700, Megha Dey wrote:
>
> +/* notify the caller of progress; request still stays in queue */
> +
> +static void notify_callback(struct mcryptd_skcipher_request_ctx *rctx,
> + struct mcryptd_alg_cstate *cstate,
> + int err)
> +{
> + struct skcipher_request *req = cast_mcryptd_ctx_to_req(rctx);
> +
> + local_bh_disable();
> + rctx->complete(&req->base, err);
> + local_bh_enable();
> +}

Please explain why you have this crazy construct that does async
operations behind the crypto API's back while pretending to be sync
by always returning zero?

Why is this even needed now that you have switched the underlying
implementation to be async?
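
For reference, the convention in the crypto API is for an async
implementation to return -EINPROGRESS and then signal completion
exactly once via the request's callback. A minimal sketch (not code
from this patch; the function names are hypothetical):

#include <crypto/skcipher.h>

static int my_skcipher_encrypt(struct skcipher_request *req)
{
	/* hand the request off to a worker or hardware queue here */
	return -EINPROGRESS;	/* caller waits for the completion callback */
}

/* invoked later from the worker, exactly once per request */
static void my_request_done(struct skcipher_request *req, int err)
{
	local_bh_disable();
	req->base.complete(&req->base, err);
	local_bh_enable();
}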

> + /* from mcryptd, we need to callback */
> + if (irqs_disabled())
> + rctx->complete(&req->base, err);
> + else {
> + local_bh_disable();
> + rctx->complete(&req->base, err);
> + local_bh_enable();
> + }

I complained about this the first time around and yet this crap is
still there.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH resend 00/18] crypto: ARM/arm64 roundup for v4.14

2017-08-02 Thread Herbert Xu
On Wed, Aug 02, 2017 at 03:46:16PM +0100, Dave Martin wrote:
> Hi Herbert,
> 
> This series from Ard is a prerequisite for an arm64 series [1] that I'd
> like to get merged this cycle (because it is in turn a prerequisite for
> another major series I want to progress).
> 
> [1] without this series will break the kernel, whereas this series
> without [1] won't break the kernel, but will cause performance
> regressions in the arm64 crypto code due to unnecessary execution of C
> fallbacks.
> 
> So it would be good to get both merged this cycle.
> 
> Can Ard's series be merged for v4.14, do you think?

I don't see any issues with this making 4.14.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: Request for Comments about Chained-IV feature in Linux crypto framework

2017-08-02 Thread Herbert Xu
On Thu, Aug 03, 2017 at 01:12:32AM +, Yu, Wenqian wrote:
> Hi, Herbert and all,
> 
> To reduce the cost of offloading symmetric ciphers to a hardware 
> accelerator, we have a proposal (chained-IV) to batch multiple SGs with 
> different IVs into one skcipher request, which also benefits SW 
> implementations. The existing skcipher with an SG list in the crypto 
> framework treats all SGs in the SG list as one single buffer for the 
> symmetric crypto operation with the same IV. In some use cases the IV for 
> each SG is different (e.g. dm-crypt, where the IV for each sector is 
> different). Could you please give quick comments on the proposal below 
> before implementation?

We've had this discussion already.  The current plan is to use
explicit IV generators, as seen here

https://patchwork.kernel.org/patch/9803473/

So please help review the existing patches and if there are any
inadequacies, please comment.

Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v1] crypto: caam - set hwrng quality level

2017-08-02 Thread Herbert Xu
On Wed, Aug 02, 2017 at 02:03:14PM +, Horia Geantă wrote:
>
> Take CAAM's engine HWRNG: it can work both as a TRNG and as a
> TRNG-seeded DRBG (that's how it's currently configured).
> IIUC, both setups are fit as source for the entropy pool.

So which is it? If it's a DRBG then it should not be exposed through
the hwrng interface.  Only TRNG should go through hwrng.  DRBGs
can use the crypto rng API.
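
For example, a DRBG consumer can allocate an rng transform and pull
bytes through it. A minimal sketch, assuming an allocatable "stdrng"
instance:

#include <linux/err.h>
#include <crypto/rng.h>

static int get_drbg_bytes(u8 *buf, unsigned int len)
{
	struct crypto_rng *rng;
	int err;

	rng = crypto_alloc_rng("stdrng", 0, 0);	/* default DRBG */
	if (IS_ERR(rng))
		return PTR_ERR(rng);

	err = crypto_rng_get_bytes(rng, buf, len);
	crypto_free_rng(rng);
	return err;
}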

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


RE: Request for Comments about Chained-IV feature in Linux crypto framework

2017-08-02 Thread Yu, Wenqian
Formatted to plain text and resent, as the HTML subpart is treated as SPAM or 
an Outlook virus by the system.

Thanks,
- Wenqian


From: Yu, Wenqian 
Sent: Thursday, August 3, 2017 9:13 AM
To: linux-crypto@vger.kernel.org
Cc: herb...@gondor.apana.org.au; dm-de...@redhat.com; m-cr...@saout.de; Milan 
Broz ; Keating, Brian A ; Will, 
Brian ; Li, Weigang ; Cabiddu, 
Giovanni ; Yu, Wenqian 
Subject: Request for Comments about Chained-IV feature in Linux crypto framework

Hi, Herbert and all,

To reduce the cost of offloading symmetric ciphers to a hardware accelerator, 
we have a proposal (chained-IV) to batch multiple SGs with different IVs into 
one skcipher request, which also benefits SW implementations. The existing 
skcipher with an SG list in the crypto framework treats all SGs in the SG list 
as one single buffer for the symmetric crypto operation with the same IV. In 
some use cases the IV for each SG is different (e.g. dm-crypt, where the IV for 
each sector is different). Could you please give quick comments on the proposal 
below before implementation?

1) Add a new flag CRYPTO_TFM_REQ_CHAINED_IV for chained-IV requests.

2) Reuse the existing iv field in the skcipher_request structure to carry all 
the IVs, with the convention that the first block-size bytes of the IV buffer 
belong to the data in the first sg, and so on. The same applies to 
aead_request.
struct skcipher_request {
    unsigned int cryptlen;
    u8 *iv;

    struct scatterlist *src;
    struct scatterlist *dst;

    struct crypto_async_request base;

    void *__ctx[] CRYPTO_MINALIGN_ATTR;
};

3) No API change, but an implementation change is required inside the skcipher 
API to handle chained-IV support; the API should enforce the same number of IV 
elements as SG elements.
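
For illustration, a rough caller-side sketch of the proposed usage. The
CRYPTO_TFM_REQ_CHAINED_IV flag and the per-SG IV layout are the
proposal's semantics, not an existing API; sector and IV sizes are
illustrative:

#include <crypto/skcipher.h>

#define NSECS		4
#define SECTOR_SIZE	512
#define IV_SIZE		16

/* sg[i] describes sector i; ivs + i * IV_SIZE is the IV for sg[i] */
static int chained_iv_encrypt(struct crypto_skcipher *tfm,
			      struct scatterlist sg[NSECS],
			      u8 ivs[NSECS * IV_SIZE])
{
	struct skcipher_request *req;
	int err;

	req = skcipher_request_alloc(tfm, GFP_KERNEL);
	if (!req)
		return -ENOMEM;

	/* proposed flag: treat req->iv as an array of per-SG IVs */
	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_CHAINED_IV,
				      NULL, NULL);
	skcipher_request_set_crypt(req, sg, sg, NSECS * SECTOR_SIZE, ivs);

	err = crypto_skcipher_encrypt(req);
	skcipher_request_free(req);
	return err;
}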

Thanks,
- Wenqian



Re: [PATCH v3 7/7] ima: Support module-style appended signatures for appraisal

2017-08-02 Thread Mimi Zohar
On Wed, 2017-08-02 at 14:42 -0300, Thiago Jung Bauermann wrote:
> Mimi Zohar  writes:
> 
> > On Thu, 2017-07-06 at 19:17 -0300, Thiago Jung Bauermann wrote:
> >> --- a/security/integrity/ima/ima_appraise.c
> >> +++ b/security/integrity/ima/ima_appraise.c
> >> @@ -200,18 +200,40 @@ int ima_read_xattr(struct dentry *dentry,
> >>   */
> >>  int ima_appraise_measurement(enum ima_hooks func,
> >> struct integrity_iint_cache *iint,
> >> -   struct file *file, const unsigned char *filename,
> >> -   struct evm_ima_xattr_data *xattr_value,
> >> -   int xattr_len, int opened)
> >> +   struct file *file, const void *buf, loff_t size,
> >> +   const unsigned char *filename,
> >> +   struct evm_ima_xattr_data **xattr_value_,
> >> +   int *xattr_len_, int opened)
> >>  {
> >>static const char op[] = "appraise_data";
> >>char *cause = "unknown";
> >>struct dentry *dentry = file_dentry(file);
> >>struct inode *inode = d_backing_inode(dentry);
> >>enum integrity_status status = INTEGRITY_UNKNOWN;
> >> -  int rc = xattr_len, hash_start = 0;
> >> +  struct evm_ima_xattr_data *xattr_value = *xattr_value_;
> >> +  int xattr_len = *xattr_len_, rc = xattr_len, hash_start = 0;
> >> +  bool appraising_modsig = false;
> >> +  void *xattr_value_evm;
> >> +  size_t xattr_len_evm;
> >> +
> >> +  if (iint->flags & IMA_MODSIG_ALLOWED) {
> >> +  /*
> >> +   * Not supposed to happen. Hooks that support modsig are
> >> +   * whitelisted when parsing the policy using
> >> +   * ima_hooks_supports_modsig.
> >> +   */
> >> +  if (!buf || !size)
> >> +  WARN_ONCE(true, "%s doesn't support modsig\n",
> >> +func_tokens[func]);
> >
> > ima_appraise_measurement() is getting kind of long. Is there any
> > reason we can't move this comment and test to ima_read_modsig()?
> 
> I didn't do that because then I would need to pass func as an argument
> to ima_read_modsig just to print the warning above. But it does simplify
> the code so it may be worth it. I'll make that change in v4.

Makes sense.

> >> @@ -229,8 +251,24 @@ int ima_appraise_measurement(enum ima_hooks func,
> >>goto out;
> >>}
> >> 
> >> -  status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value, rc, iint);
> >> -  if ((status != INTEGRITY_PASS) && (status != INTEGRITY_UNKNOWN)) {
> >> +  /*
> >> +   * Appended signatures aren't protected by EVM but we still call
> >> +   * evm_verifyxattr to check other security xattrs, if they exist.
> >> +   */
> >> +  if (appraising_modsig) {
> >> +  xattr_value_evm = NULL;
> >> +  xattr_len_evm = 0;
> >> +  } else {
> >> +  xattr_value_evm = xattr_value;
> >> +  xattr_len_evm = xattr_len;
> >> +  }
> >> +
> >> +  status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value_evm,
> >> +   xattr_len_evm, iint);
> >> +  if (appraising_modsig && status == INTEGRITY_FAIL) {
> >> +  cause = "invalid-HMAC";
> >> +  goto out;
> >
> > "modsig" is special, because having any security xattrs is not
> > required. This test doesn't prevent status from being set to
> > "missing-HMAC". This test is redundant with the original tests below.
> 
> Indeed, that is wrong. I'm still a bit fuzzy about how EVM works and how
> it interacts with IMA. The only way I can think of singling out modsig
> without reintroducing the complex expression you didn't like in v2 is as
> below. What do you think?

The original code, without any extra tests, should be fine.

> 
> @@ -229,8 +241,25 @@ int ima_appraise_measurement(enum ima_hooks func,
>   goto out;
>   }
> 
> - status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value, rc, iint);
> - if ((status != INTEGRITY_PASS) && (status != INTEGRITY_UNKNOWN)) {
> + /*
> +  * Appended signatures aren't protected by EVM but we still call
> +  * evm_verifyxattr to check other security xattrs, if they exist.
> +  */
> + if (appraising_modsig) {
> + xattr_value_evm = NULL;
> + xattr_len_evm = 0;
> + } else {
> + xattr_value_evm = xattr_value;
> + xattr_len_evm = xattr_len;
> + }
> +
> + status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value_evm,
> +  xattr_len_evm, iint);
> + if (appraising_modsig && (status == INTEGRITY_NOLABEL
> +   || status == INTEGRITY_NOXATTRS))
> + /* It's ok if there's no xattr in the case of modsig. */
> + ;
> + else if (status != INTEGRITY_PASS && status != INTEGRITY_UNKNOWN) {
>   if ((status == INTEGRITY_NOLABEL)
>   || (status == INTEGRITY_NOXATTRS))
>   cause = "missing-HMAC";
> 

[Patch V4] crypto: x86/sha1 : Fix reads beyond the number of blocks passed

2017-08-02 Thread Megha Dey
It was reported that the sha1 AVX2 function (sha1_transform_avx2) is
reading ahead beyond its intended data, and causing a crash if the next
block is beyond a page boundary:
http://marc.info/?l=linux-crypto-vger&m=149373371023377

This patch makes sure that there is no overflow for any buffer length.

It passes the tests written by Jan Stancek that revealed this problem:
https://github.com/jstancek/sha1-avx2-crash

I have re-enabled sha1-avx2 by reverting commit
b82ce24426a4071da9529d726057e4e642948667

Originally-by: Ilya Albrekht 
Tested-by: Jan Stancek 
Signed-off-by: Megha Dey 
Reported-by: Jan Stancek 
---
 arch/x86/crypto/sha1_avx2_x86_64_asm.S | 67 ++
 arch/x86/crypto/sha1_ssse3_glue.c  |  2 +-
 2 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 1cd792d..1eab79c 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -117,11 +117,10 @@
.set T1, REG_T1
 .endm
 
-#define K_BASE %r8
 #define HASH_PTR   %r9
+#define BLOCKS_CTR %r8
 #define BUFFER_PTR %r10
 #define BUFFER_PTR2%r13
-#define BUFFER_END %r11
 
 #define PRECALC_BUF%r14
 #define WK_BUF %r15
@@ -205,14 +204,14 @@
 * blended AVX2 and ALU instruction scheduling
 * 1 vector iteration per 8 rounds
 */
-   vmovdqu ((i * 2) + PRECALC_OFFSET)(BUFFER_PTR), W_TMP
+   vmovdqu (i * 2)(BUFFER_PTR), W_TMP
.elseif ((i & 7) == 1)
-   vinsertf128 $1, (((i-1) * 2)+PRECALC_OFFSET)(BUFFER_PTR2),\
+   vinsertf128 $1, ((i-1) * 2)(BUFFER_PTR2),\
 WY_TMP, WY_TMP
.elseif ((i & 7) == 2)
vpshufb YMM_SHUFB_BSWAP, WY_TMP, WY
.elseif ((i & 7) == 4)
-   vpaddd  K_XMM(K_BASE), WY, WY_TMP
+   vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
.elseif ((i & 7) == 7)
vmovdqu  WY_TMP, PRECALC_WK(i&~7)
 
@@ -255,7 +254,7 @@
vpxor   WY, WY_TMP, WY_TMP
.elseif ((i & 7) == 7)
vpxor   WY_TMP2, WY_TMP, WY
-   vpaddd  K_XMM(K_BASE), WY, WY_TMP
+   vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
vmovdqu WY_TMP, PRECALC_WK(i&~7)
 
PRECALC_ROTATE_WY
@@ -291,7 +290,7 @@
vpsrld  $30, WY, WY
vpor    WY, WY_TMP, WY
.elseif ((i & 7) == 7)
-   vpaddd  K_XMM(K_BASE), WY, WY_TMP
+   vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
vmovdqu WY_TMP, PRECALC_WK(i&~7)
 
PRECALC_ROTATE_WY
@@ -446,6 +445,16 @@
 
 .endm
 
+/* Add constant only if (%2 > %3) condition met (uses RTA as temp)
+ * %1 + %2 >= %3 ? %4 : 0
+ */
+.macro ADD_IF_GE a, b, c, d
+   mov \a, RTA
+   add $\d, RTA
+   cmp $\c, \b
+   cmovge  RTA, \a
+.endm
+
 /*
  * macro implements 80 rounds of SHA-1, for multiple blocks with s/w pipelining
  */
@@ -463,13 +472,16 @@
lea (2*4*80+32)(%rsp), WK_BUF
 
# Precalc WK for first 2 blocks
-   PRECALC_OFFSET = 0
+   ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 2, 64
.set i, 0
.rept 160
PRECALC i
.set i, i + 1
.endr
-   PRECALC_OFFSET = 128
+
+   /* Go to next block if needed */
+   ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 3, 128
+   ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 4, 128
xchg    WK_BUF, PRECALC_BUF
 
.align 32
@@ -479,8 +491,8 @@ _loop:
 * we use K_BASE value as a signal of a last block,
 * it is set below by: cmovae BUFFER_PTR, K_BASE
 */
-   cmp K_BASE, BUFFER_PTR
-   jne _begin
+   test BLOCKS_CTR, BLOCKS_CTR
+   jnz _begin
.align 32
jmp _end
.align 32
@@ -512,10 +524,10 @@ _loop0:
.set j, j+2
.endr
 
-   add $(2*64), BUFFER_PTR   /* move to next odd-64-byte block */
-   cmp BUFFER_END, BUFFER_PTR/* is current block the last one? */
-   cmovae  K_BASE, BUFFER_PTR  /* signal the last iteration smartly */
-
+   /* Update Counter */
+   sub $1, BLOCKS_CTR
+   /* Move to the next block only if needed*/
+   ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 4, 128
/*
 * rounds
 * 60,62,64,66,68
@@ -532,8 +544,8 @@ _loop0:
UPDATE_HASH 12(HASH_PTR), D
UPDATE_HASH 16(HASH_PTR), E
 
-   cmp K_BASE, BUFFER_PTR  /* is current block the last one? */
-   je  _loop
+   test    BLOCKS_CTR, BLOCKS_CTR
+   jz  _loop
 
mov TB, B
 
@@ -575,10 +587,10 @@ _loop2:
.set j, j+2
.endr
 
-   add $(2*64), BUFFER_PTR2  /* move to next even-64-byte 

Re: [Patch V3] crypto: x86/sha1 : Fix reads beyond the number of blocks passed

2017-08-02 Thread Jan Stancek



- Original Message -
> On Wed, 2017-08-02 at 10:13 -0700, Megha Dey wrote:
> > It was reported that the sha1 AVX2 function (sha1_transform_avx2) is
> > reading ahead beyond its intended data, and causing a crash if the next
> > block is beyond a page boundary:
> > http://marc.info/?l=linux-crypto-vger&m=149373371023377
> > 
> > This patch makes sure that there is no overflow for any buffer length.
> > 
> > It passes the tests written by Jan Stancek that revealed this problem:
> > https://github.com/jstancek/sha1-avx2-crash
> 
> Hi Jan,
> 
> Is it ok to add your Tested-by?

Yes, the v3 patch is exactly the diff I was testing.

Regards,
Jan


Re: [Patch V3] crypto: x86/sha1 : Fix reads beyond the number of blocks passed

2017-08-02 Thread Megha Dey
On Wed, 2017-08-02 at 10:13 -0700, Megha Dey wrote:
> It was reported that the sha1 AVX2 function (sha1_transform_avx2) is
> reading ahead beyond its intended data, and causing a crash if the next
> block is beyond a page boundary:
> http://marc.info/?l=linux-crypto-vger&m=149373371023377
> 
> This patch makes sure that there is no overflow for any buffer length.
> 
> It passes the tests written by Jan Stancek that revealed this problem:
> https://github.com/jstancek/sha1-avx2-crash

Hi Jan,

Is it ok to add your Tested-by?

> 
> I have re-enabled sha1-avx2 by reverting commit
> b82ce24426a4071da9529d726057e4e642948667
> 
> Originally-by: Ilya Albrekht 
> Signed-off-by: Megha Dey 
> Reported-by: Jan Stancek 
> ---
>  arch/x86/crypto/sha1_avx2_x86_64_asm.S | 67 
> ++
>  arch/x86/crypto/sha1_ssse3_glue.c  |  2 +-
>  2 files changed, 37 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
> index 1cd792d..1eab79c 100644
> --- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
> +++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
> @@ -117,11 +117,10 @@
>   .set T1, REG_T1
>  .endm
>  
> -#define K_BASE   %r8
>  #define HASH_PTR %r9
> +#define BLOCKS_CTR   %r8
>  #define BUFFER_PTR   %r10
>  #define BUFFER_PTR2  %r13
> -#define BUFFER_END   %r11
>  
>  #define PRECALC_BUF  %r14
>  #define WK_BUF   %r15
> @@ -205,14 +204,14 @@
>* blended AVX2 and ALU instruction scheduling
>* 1 vector iteration per 8 rounds
>*/
> - vmovdqu ((i * 2) + PRECALC_OFFSET)(BUFFER_PTR), W_TMP
> + vmovdqu (i * 2)(BUFFER_PTR), W_TMP
>   .elseif ((i & 7) == 1)
> - vinsertf128 $1, (((i-1) * 2)+PRECALC_OFFSET)(BUFFER_PTR2),\
> + vinsertf128 $1, ((i-1) * 2)(BUFFER_PTR2),\
>WY_TMP, WY_TMP
>   .elseif ((i & 7) == 2)
>   vpshufb YMM_SHUFB_BSWAP, WY_TMP, WY
>   .elseif ((i & 7) == 4)
> - vpaddd  K_XMM(K_BASE), WY, WY_TMP
> + vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
>   .elseif ((i & 7) == 7)
>   vmovdqu  WY_TMP, PRECALC_WK(i&~7)
>  
> @@ -255,7 +254,7 @@
>   vpxor   WY, WY_TMP, WY_TMP
>   .elseif ((i & 7) == 7)
>   vpxor   WY_TMP2, WY_TMP, WY
> - vpaddd  K_XMM(K_BASE), WY, WY_TMP
> + vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
>   vmovdqu WY_TMP, PRECALC_WK(i&~7)
>  
>   PRECALC_ROTATE_WY
> @@ -291,7 +290,7 @@
>   vpsrld  $30, WY, WY
>   vpor    WY, WY_TMP, WY
>   .elseif ((i & 7) == 7)
> - vpaddd  K_XMM(K_BASE), WY, WY_TMP
> + vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
>   vmovdqu WY_TMP, PRECALC_WK(i&~7)
>  
>   PRECALC_ROTATE_WY
> @@ -446,6 +445,16 @@
>  
>  .endm
>  
> +/* Add constant only if (%2 > %3) condition met (uses RTA as temp)
> + * %1 + %2 >= %3 ? %4 : 0
> + */
> +.macro ADD_IF_GE a, b, c, d
> + mov \a, RTA
> + add $\d, RTA
> + cmp $\c, \b
> + cmovge  RTA, \a
> +.endm
> +
>  /*
>   * macro implements 80 rounds of SHA-1, for multiple blocks with s/w 
> pipelining
>   */
> @@ -463,13 +472,16 @@
>   lea (2*4*80+32)(%rsp), WK_BUF
>  
>   # Precalc WK for first 2 blocks
> - PRECALC_OFFSET = 0
> + ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 2, 64
>   .set i, 0
>   .rept 160
>   PRECALC i
>   .set i, i + 1
>   .endr
> - PRECALC_OFFSET = 128
> +
> + /* Go to next block if needed */
> + ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 3, 128
> + ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 4, 128
>   xchg    WK_BUF, PRECALC_BUF
>  
>   .align 32
> @@ -479,8 +491,8 @@ _loop:
>* we use K_BASE value as a signal of a last block,
>* it is set below by: cmovae BUFFER_PTR, K_BASE
>*/
> - cmp K_BASE, BUFFER_PTR
> - jne _begin
> + test BLOCKS_CTR, BLOCKS_CTR
> + jnz _begin
>   .align 32
>   jmp _end
>   .align 32
> @@ -512,10 +524,10 @@ _loop0:
>   .set j, j+2
>   .endr
>  
> - add $(2*64), BUFFER_PTR   /* move to next odd-64-byte block */
> - cmp BUFFER_END, BUFFER_PTR/* is current block the last one? */
> - cmovae  K_BASE, BUFFER_PTR  /* signal the last iteration smartly */
> -
> + /* Update Counter */
> + sub $1, BLOCKS_CTR
> + /* Move to the next block only if needed*/
> + ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 4, 128
>   /*
>* rounds
>* 60,62,64,66,68
> @@ -532,8 +544,8 @@ _loop0:
>   UPDATE_HASH 12(HASH_PTR), D
>   UPDATE_HASH 16(HASH_PTR), E
>  
> - cmp K_BASE, BUFFER_PTR  /* is current block the last one? */
> - je  _loop
> + test    BLOCKS_CTR, BLOCKS_CTR
> +  

Re: [PATCH v3 7/7] ima: Support module-style appended signatures for appraisal

2017-08-02 Thread Thiago Jung Bauermann

Mimi Zohar  writes:

> On Thu, 2017-07-06 at 19:17 -0300, Thiago Jung Bauermann wrote:
>> --- a/security/integrity/ima/ima_appraise.c
>> +++ b/security/integrity/ima/ima_appraise.c
>> @@ -200,18 +200,40 @@ int ima_read_xattr(struct dentry *dentry,
>>   */
>>  int ima_appraise_measurement(enum ima_hooks func,
>>   struct integrity_iint_cache *iint,
>> - struct file *file, const unsigned char *filename,
>> - struct evm_ima_xattr_data *xattr_value,
>> - int xattr_len, int opened)
>> + struct file *file, const void *buf, loff_t size,
>> + const unsigned char *filename,
>> + struct evm_ima_xattr_data **xattr_value_,
>> + int *xattr_len_, int opened)
>>  {
>>  static const char op[] = "appraise_data";
>>  char *cause = "unknown";
>>  struct dentry *dentry = file_dentry(file);
>>  struct inode *inode = d_backing_inode(dentry);
>>  enum integrity_status status = INTEGRITY_UNKNOWN;
>> -int rc = xattr_len, hash_start = 0;
>> +struct evm_ima_xattr_data *xattr_value = *xattr_value_;
>> +int xattr_len = *xattr_len_, rc = xattr_len, hash_start = 0;
>> +bool appraising_modsig = false;
>> +void *xattr_value_evm;
>> +size_t xattr_len_evm;
>> +
>> +if (iint->flags & IMA_MODSIG_ALLOWED) {
>> +/*
>> + * Not supposed to happen. Hooks that support modsig are
>> + * whitelisted when parsing the policy using
>> + * ima_hooks_supports_modsig.
>> + */
>> +if (!buf || !size)
>> +WARN_ONCE(true, "%s doesn't support modsig\n",
>> +  func_tokens[func]);
>
> ima_appraise_measurement() is getting kind of long. Is there any
> reason we can't move this comment and test to ima_read_modsig()?

I didn't do that because then I would need to pass func as an argument
to ima_read_modsig just to print the warning above. But it does simplify
the code so it may be worth it. I'll make that change in v4.

>> @@ -229,8 +251,24 @@ int ima_appraise_measurement(enum ima_hooks func,
>>  goto out;
>>  }
>> 
>> -status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value, rc, iint);
>> -if ((status != INTEGRITY_PASS) && (status != INTEGRITY_UNKNOWN)) {
>> +/*
>> + * Appended signatures aren't protected by EVM but we still call
>> + * evm_verifyxattr to check other security xattrs, if they exist.
>> + */
>> +if (appraising_modsig) {
>> +xattr_value_evm = NULL;
>> +xattr_len_evm = 0;
>> +} else {
>> +xattr_value_evm = xattr_value;
>> +xattr_len_evm = xattr_len;
>> +}
>> +
>> +status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value_evm,
>> + xattr_len_evm, iint);
>> +if (appraising_modsig && status == INTEGRITY_FAIL) {
>> +cause = "invalid-HMAC";
>> +goto out;
>
> "modsig" is special, because having any security xattrs is not
> required. This test doesn't prevent status from being set to
> "missing-HMAC". This test is redundant with the original tests below.

Indeed, that is wrong. I'm still a bit fuzzy about how EVM works and how
it interacts with IMA. The only way I can think of singling out modsig
without reintroducing the complex expression you didn't like in v2 is as
below. What do you think?

@@ -229,8 +241,25 @@ int ima_appraise_measurement(enum ima_hooks func,
goto out;
}
 
-   status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value, rc, iint);
-   if ((status != INTEGRITY_PASS) && (status != INTEGRITY_UNKNOWN)) {
+   /*
+* Appended signatures aren't protected by EVM but we still call
+* evm_verifyxattr to check other security xattrs, if they exist.
+*/
+   if (appraising_modsig) {
+   xattr_value_evm = NULL;
+   xattr_len_evm = 0;
+   } else {
+   xattr_value_evm = xattr_value;
+   xattr_len_evm = xattr_len;
+   }
+
+   status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value_evm,
+xattr_len_evm, iint);
+   if (appraising_modsig && (status == INTEGRITY_NOLABEL
+ || status == INTEGRITY_NOXATTRS))
+   /* It's ok if there's no xattr in the case of modsig. */
+   ;
+   else if (status != INTEGRITY_PASS && status != INTEGRITY_UNKNOWN) {
if ((status == INTEGRITY_NOLABEL)
|| (status == INTEGRITY_NOXATTRS))
cause = "missing-HMAC";

>> +} else if (status != INTEGRITY_PASS && status != INTEGRITY_UNKNOWN) {
>>  if ((status == INTEGRITY_NOLABEL)
>>  || (status == INTEGRITY_NOXATTRS))
>> 

Re: [PATCH v3 1/7] integrity: Introduce struct evm_hmac_xattr

2017-08-02 Thread Thiago Jung Bauermann

Hello Mimi,

Thanks for your review!

The patch at the end of the email implements your suggestions, what do
you think?

Mimi Zohar  writes:
> On Thu, 2017-07-06 at 19:17 -0300, Thiago Jung Bauermann wrote:
>> A separate struct evm_hmac_xattr is introduced, with the original
>> definition of evm_ima_xattr_data to be used in the places that actually
>> expect that definition.
>
> The new structure name implies that the xattr can only be an HMAC. By
> moving the new structure to evm.h we also loose the connection that it
> has to theevm_ima_xattr_type enumeration.

Ok, I renamed it to struct evm_xattr.

> Instead, how about defining the new struct in terms of the modified
> evm_ima_xattr_data struct?

IMHO IMA and EVM's practice of mixing and matching structs to compose
their data structures makes the code somewhat confusing and makes it
harder to see where the actual storage for a given field is allocated.

But I don't feel strongly about it, so the patch at the end of this
email defines it as:

struct evm_xattr {
struct evm_ima_xattr_data data;
u8 digest[SHA1_DIGEST_SIZE];
} __packed;

Another disadvantage of the above is that you have two fields pointing
to the same place: evm_xattr.data.data and evm_xattr.digest.
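
For reference, this is the modified generic struct (as described in the
patch below) that the aliasing follows from - once the array loses its
fixed size, data.data and digest start at the same offset:

struct evm_ima_xattr_data {
	u8 type;
	u8 data[];
} __packed;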

> Please leave the new structure in integrity.h.

I think that moving the structure in evm.h is helpful. It immediately
conveys that nothing outside of security/integrity/evm/ interprets the
xattr data in the way defined by struct evm_xattr.

But I also don't feel strongly about it, so I moved it to integrity.h.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center


From df2e29f29c99f5c986d8005d8af8409b3c4d115c Mon Sep 17 00:00:00 2001
From: Thiago Jung Bauermann 
Date: Tue, 16 May 2017 17:14:49 -0300
Subject: [PATCH v4 1/7] integrity: Introduce struct evm_xattr

Even though struct evm_ima_xattr_data includes a fixed-size array to hold a
SHA1 digest, most of the code ignores the array and uses the struct to mean
"type indicator followed by data of unspecified size" and tracks the real
size of what the struct represents in a separate length variable.

The only exception to that is the EVM code, which correctly uses the
definition of struct evm_ima_xattr_data.

This patch makes this explicit in the code by removing the length
specification from the array in struct evm_ima_xattr_data. It also changes
the name of the element from digest to data, since in most places the array
doesn't hold a digest.

A separate struct evm_xattr is introduced, with the original definition of
evm_ima_xattr_data to be used in the places that actually expect that
definition.

Signed-off-by: Thiago Jung Bauermann 
---
 security/integrity/evm/evm_crypto.c   |  4 ++--
 security/integrity/evm/evm_main.c | 10 +-
 security/integrity/ima/ima_appraise.c |  7 ---
 security/integrity/integrity.h|  5 +
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/security/integrity/evm/evm_crypto.c b/security/integrity/evm/evm_crypto.c
index d7f282d75cc1..d902a18ef66f 100644
--- a/security/integrity/evm/evm_crypto.c
+++ b/security/integrity/evm/evm_crypto.c
@@ -252,13 +252,13 @@ int evm_update_evmxattr(struct dentry *dentry, const char *xattr_name,
const char *xattr_value, size_t xattr_value_len)
 {
struct inode *inode = d_backing_inode(dentry);
-   struct evm_ima_xattr_data xattr_data;
+   struct evm_xattr xattr_data;
int rc = 0;
 
rc = evm_calc_hmac(dentry, xattr_name, xattr_value,
   xattr_value_len, xattr_data.digest);
if (rc == 0) {
-   xattr_data.type = EVM_XATTR_HMAC;
+   xattr_data.data.type = EVM_XATTR_HMAC;
rc = __vfs_setxattr_noperm(dentry, XATTR_NAME_EVM,
   &xattr_data,
   sizeof(xattr_data), 0);
diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
index 063d38aef64e..536694499515 100644
--- a/security/integrity/evm/evm_main.c
+++ b/security/integrity/evm/evm_main.c
@@ -116,7 +116,7 @@ static enum integrity_status evm_verify_hmac(struct dentry *dentry,
 struct integrity_iint_cache *iint)
 {
struct evm_ima_xattr_data *xattr_data = NULL;
-   struct evm_ima_xattr_data calc;
+   struct evm_xattr calc;
enum integrity_status evm_status = INTEGRITY_PASS;
int rc, xattr_len;
 
@@ -147,7 +147,7 @@ static enum integrity_status evm_verify_hmac(struct dentry *dentry,
/* check value type */
switch (xattr_data->type) {
case EVM_XATTR_HMAC:
-   if (xattr_len != sizeof(struct evm_ima_xattr_data)) {
+   if (xattr_len != sizeof(struct evm_xattr)) {
evm_status = INTEGRITY_FAIL;
goto out;

[Patch V3] crypto: x86/sha1 : Fix reads beyond the number of blocks passed

2017-08-02 Thread Megha Dey
It was reported that the sha1 AVX2 function (sha1_transform_avx2) is
reading ahead beyond its intended data, and causing a crash if the next
block is beyond a page boundary:
http://marc.info/?l=linux-crypto-vger&m=149373371023377

This patch makes sure that there is no overflow for any buffer length.

It passes the tests written by Jan Stancek that revealed this problem:
https://github.com/jstancek/sha1-avx2-crash

I have re-enabled sha1-avx2 by reverting commit
b82ce24426a4071da9529d726057e4e642948667

Originally-by: Ilya Albrekht 
Signed-off-by: Megha Dey 
Reported-by: Jan Stancek 
---
 arch/x86/crypto/sha1_avx2_x86_64_asm.S | 67 ++
 arch/x86/crypto/sha1_ssse3_glue.c  |  2 +-
 2 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 1cd792d..1eab79c 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -117,11 +117,10 @@
.set T1, REG_T1
 .endm
 
-#define K_BASE %r8
 #define HASH_PTR   %r9
+#define BLOCKS_CTR %r8
 #define BUFFER_PTR %r10
 #define BUFFER_PTR2%r13
-#define BUFFER_END %r11
 
 #define PRECALC_BUF%r14
 #define WK_BUF %r15
@@ -205,14 +204,14 @@
 * blended AVX2 and ALU instruction scheduling
 * 1 vector iteration per 8 rounds
 */
-   vmovdqu ((i * 2) + PRECALC_OFFSET)(BUFFER_PTR), W_TMP
+   vmovdqu (i * 2)(BUFFER_PTR), W_TMP
.elseif ((i & 7) == 1)
-   vinsertf128 $1, (((i-1) * 2)+PRECALC_OFFSET)(BUFFER_PTR2),\
+   vinsertf128 $1, ((i-1) * 2)(BUFFER_PTR2),\
 WY_TMP, WY_TMP
.elseif ((i & 7) == 2)
vpshufb YMM_SHUFB_BSWAP, WY_TMP, WY
.elseif ((i & 7) == 4)
-   vpaddd  K_XMM(K_BASE), WY, WY_TMP
+   vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
.elseif ((i & 7) == 7)
vmovdqu  WY_TMP, PRECALC_WK(i&~7)
 
@@ -255,7 +254,7 @@
vpxor   WY, WY_TMP, WY_TMP
.elseif ((i & 7) == 7)
vpxor   WY_TMP2, WY_TMP, WY
-   vpaddd  K_XMM(K_BASE), WY, WY_TMP
+   vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
vmovdqu WY_TMP, PRECALC_WK(i&~7)
 
PRECALC_ROTATE_WY
@@ -291,7 +290,7 @@
vpsrld  $30, WY, WY
vpor    WY, WY_TMP, WY
.elseif ((i & 7) == 7)
-   vpaddd  K_XMM(K_BASE), WY, WY_TMP
+   vpaddd  K_XMM + K_XMM_AR(%rip), WY, WY_TMP
vmovdqu WY_TMP, PRECALC_WK(i&~7)
 
PRECALC_ROTATE_WY
@@ -446,6 +445,16 @@
 
 .endm
 
+/* Add constant only if (%2 > %3) condition met (uses RTA as temp)
+ * %1 + %2 >= %3 ? %4 : 0
+ */
+.macro ADD_IF_GE a, b, c, d
+   mov \a, RTA
+   add $\d, RTA
+   cmp $\c, \b
+   cmovge  RTA, \a
+.endm
+
 /*
  * macro implements 80 rounds of SHA-1, for multiple blocks with s/w pipelining
  */
@@ -463,13 +472,16 @@
lea (2*4*80+32)(%rsp), WK_BUF
 
# Precalc WK for first 2 blocks
-   PRECALC_OFFSET = 0
+   ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 2, 64
.set i, 0
.rept 160
PRECALC i
.set i, i + 1
.endr
-   PRECALC_OFFSET = 128
+
+   /* Go to next block if needed */
+   ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 3, 128
+   ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 4, 128
xchg    WK_BUF, PRECALC_BUF
 
.align 32
@@ -479,8 +491,8 @@ _loop:
 * we use K_BASE value as a signal of a last block,
 * it is set below by: cmovae BUFFER_PTR, K_BASE
 */
-   cmp K_BASE, BUFFER_PTR
-   jne _begin
+   test BLOCKS_CTR, BLOCKS_CTR
+   jnz _begin
.align 32
jmp _end
.align 32
@@ -512,10 +524,10 @@ _loop0:
.set j, j+2
.endr
 
-   add $(2*64), BUFFER_PTR   /* move to next odd-64-byte block */
-   cmp BUFFER_END, BUFFER_PTR/* is current block the last one? */
-   cmovae  K_BASE, BUFFER_PTR  /* signal the last iteration smartly */
-
+   /* Update Counter */
+   sub $1, BLOCKS_CTR
+   /* Move to the next block only if needed*/
+   ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 4, 128
/*
 * rounds
 * 60,62,64,66,68
@@ -532,8 +544,8 @@ _loop0:
UPDATE_HASH 12(HASH_PTR), D
UPDATE_HASH 16(HASH_PTR), E
 
-   cmp K_BASE, BUFFER_PTR  /* is current block the last one? */
-   je  _loop
+   test    BLOCKS_CTR, BLOCKS_CTR
+   jz  _loop
 
mov TB, B
 
@@ -575,10 +587,10 @@ _loop2:
.set j, j+2
.endr
 
-   add $(2*64), BUFFER_PTR2  /* move to next even-64-byte block */
-
-   cmp BUFFER_END, 

Re: [Patch V2] crypto: x86/sha1 : Fix reads beyond the number of blocks passed

2017-08-02 Thread Tim Chen
On 08/02/2017 03:39 AM, Jan Stancek wrote:
> On 08/02/2017 02:38 AM, Megha Dey wrote:
>> It was reported that the sha1 AVX2 function (sha1_transform_avx2) is
>> reading ahead beyond its intended data, and causing a crash if the next
>> block is beyond a page boundary:
>> http://marc.info/?l=linux-crypto-vger&m=149373371023377
>>
>> This patch makes sure that there is no overflow for any buffer length.
>>
>> It passes the tests written by Jan Stancek that revealed this problem:
>> https://github.com/jstancek/sha1-avx2-crash
>>
>> Jan, can you verify this fix?
> 
> Hi,
> 
> Looks good, my tests (below) PASS-ed.
> 
> I updated reproducer at [1] to try more than just single scenario. It
> now tries thousands of combinations with different starting offset within
> page and sizes of data chunks.
> 
> (without patch) v4.13-rc3-102-g26c5ceb with avx2 enabled:
> [  106.455175] sha_test module loaded
> [  106.460458] data is at 0x8c88e58f8000, page_after_data: 0x8c88e58fa000
> [  106.468625] sha_test testing hash: sha1-generic, maxlen: 8192
> [  127.091237] sha_test hash: sha1-generic calculated 1764032 hashes
> [  127.098038] sha_test testing hash: sha1-ni, maxlen: 8192
> [  127.108805] failed to alloc sha1-ni
> [  127.112703] sha_test testing hash: sha1-avx, maxlen: 8192
> [  141.166611] sha_test hash: sha1-avx calculated 1764032 hashes
> [  141.173023] sha_test testing hash: sha1-avx2, maxlen: 8192
> [  141.180046] BUG: unable to handle kernel paging request at 8c88e58fa000
> [  141.187817] IP: _begin+0x28/0x187
> [  141.191512] PGD 89c578067
> [  141.191513] P4D 89c578067
> [  141.194529] PUD c61f0a063
> [  141.197545] PMD c64cb1063
> [  141.200561] PTE 800c658fa062
> [  141.203577]
> [  141.208832] Oops:  [#1] SMP
> ...
> 
> With patch "[Patch V2] crypto: x86/sha1 : Fix reads beyond the number of blocks passed":
> [  406.323781] sha_test module loaded
> [  406.329062] data is at 0x93ab97108000, page_after_data: 0x93ab9710a000
> [  406.337185] sha_test testing hash: sha1-generic, maxlen: 8192
> [  426.157242] sha_test hash: sha1-generic calculated 1764032 hashes
> [  426.164043] sha_test testing hash: sha1-ni, maxlen: 8192
> [  426.174244] failed to alloc sha1-ni
> [  426.178139] sha_test testing hash: sha1-avx, maxlen: 8192
> [  440.226240] sha_test hash: sha1-avx calculated 1764032 hashes
> [  440.232651] sha_test testing hash: sha1-avx2, maxlen: 8192
> [  452.497988] sha_test hash: sha1-avx2 calculated 1764032 hashes
> [  452.504495] sha_test testing hash: sha1-ssse3, maxlen: 8192
> [  467.319316] sha_test hash: sha1-ssse3 calculated 1764032 hashes
> [  467.325922] sha_test done
> [  471.827083] sha_test module unloaded
> 
> I also ran a test at [2], which is comparing hashes produced by
> kernel with hashes produced by user-space utilities (e.g. 'sha1sum').
> This test also PASS-ed.

Great.  Should the fix also be picked up in the stable branches, since
it was disabled in the stable releases?  Or are the changes too much
for stable?

Tim




Re: [PATCH resend 00/18] crypto: ARM/arm64 roundup for v4.14

2017-08-02 Thread Dave Martin
Hi Herbert,

This series from Ard is a prerequisite for an arm64 series [1] that I'd
like to get merged this cycle (because it is in turn a prerequisite for
another major series I want to progress).

[1] without this series will break the kernel, whereas this series
without [1] won't break the kernel, but will cause performance
regressions in the arm64 crypto code due to unnecessary execution of C
fallbacks.

So it would be good to get both merged this cycle.

Can Ard's series be merged for v4.14, do you think?

I'll let Catalin comment the readiness of [1] for merging via arm64.
(I just need to repost it to fold in a late squash.)

Cheers
---Dave

[1] [RFC PATCH v4 0/5] Simplify kernel-mode NEON
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/521838.html


On Mon, Jul 24, 2017 at 11:28:02AM +0100, Ard Biesheuvel wrote:
> This is a resend of all the patches I sent out recently that I would
> like to be considered for v4.14. Their main purpose is to prepare the
> arm64 crypto code to deal with situations where the SIMD register file
> is unavailable, which never occurs at present, but this will change in
> the future when support for SVE is added.
> 
> Patches #1 and #2 have been sent out last week as 'crypto/algapi - refactor
> crypto_xor() to avoid memcpy()s' (v2). This version of #2 fixes an error
> caught by kbuild. The non-SIMD fallback code added in the remaining patches
> relies on crypto_xor() extensively, which is why these patches have been
> included here.
> 
> Patches #3 - #13 implement the non-SIMD fallbacks for the various NEON
> based drivers.
> 
> Patch #14 implements AES-GCM natively instead of relying on the generic
> GCM module to wire accelerated AES-CTR and GHASH together, resulting in
> a ~37% speedup.
> 
> Patches #15 and #16 implement an accelerated GHASH algorithm for ARM cores
> that lack the 64x64 PMULL instruction.
> 
> Patches #17 and #18 update the scalar AES implementations to stop using
> the expanded lookup tables for the final round. This reduces the Dcache
> footprint, and thus the key correlated jitter.
> 
> This supersedes all other crypto patches I have outstanding, including the
> AES refactor ones which I will rework later.
> 
> Ard Biesheuvel (18):
>   crypto/algapi - use separate dst and src operands for __crypto_xor()
>   crypto/algapi - make crypto_xor() take separate dst and src arguments
>   crypto: arm64/ghash-ce - add non-SIMD scalar fallback
>   crypto: arm64/crct10dif - add non-SIMD generic fallback
>   crypto: arm64/crc32 - add non-SIMD scalar fallback
>   crypto: arm64/sha1-ce - add non-SIMD generic fallback
>   crypto: arm64/sha2-ce - add non-SIMD scalar fallback
>   crypto: arm64/aes-ce-cipher - match round key endianness with generic
> code
>   crypto: arm64/aes-ce-cipher: add non-SIMD generic fallback
>   crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback
>   crypto: arm64/aes-blk - add a non-SIMD fallback for synchronous CTR
>   crypto: arm64/chacha20 - take may_use_simd() into account
>   crypto: arm64/aes-bs - implement non-SIMD fallback for AES-CTR
>   crypto: arm64/gcm - implement native driver using v8 Crypto Extensions
>   crypto: arm/ghash - add NEON accelerated fallback for vmull.p64
>   crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL
>   crypto: arm/aes - avoid expanded lookup tables in the final round
>   crypto: arm64/aes - avoid expanded lookup tables in the final round
> 
>  arch/arm/crypto/Kconfig|   5 +-
>  arch/arm/crypto/aes-ce-glue.c  |   4 +-
>  arch/arm/crypto/aes-cipher-core.S  |  88 +++-
>  arch/arm/crypto/aes-neonbs-glue.c  |   5 +-
>  arch/arm/crypto/ghash-ce-core.S| 234 +++--
>  arch/arm/crypto/ghash-ce-glue.c|  24 +-
>  arch/arm64/crypto/Kconfig  |  22 +-
>  arch/arm64/crypto/aes-ce-ccm-core.S|  30 +-
>  arch/arm64/crypto/aes-ce-ccm-glue.c| 174 +--
>  arch/arm64/crypto/aes-ce-cipher.c  |  55 ++-
>  arch/arm64/crypto/aes-ce.S |  12 +-
>  arch/arm64/crypto/aes-cipher-core.S| 152 --
>  arch/arm64/crypto/aes-ctr-fallback.h   |  53 ++
>  arch/arm64/crypto/aes-glue.c   |  63 ++-
>  arch/arm64/crypto/aes-neonbs-glue.c|  53 +-
>  arch/arm64/crypto/chacha20-neon-glue.c |   5 +-
>  arch/arm64/crypto/crc32-ce-glue.c  |  11 +-
>  arch/arm64/crypto/crct10dif-ce-glue.c  |  13 +-
>  arch/arm64/crypto/ghash-ce-core.S  | 401 ++-
>  arch/arm64/crypto/ghash-ce-glue.c  | 517 ++--
>  arch/arm64/crypto/sha1-ce-glue.c   |  18 +-
>  arch/arm64/crypto/sha2-ce-glue.c   |  30 +-
>  arch/arm64/crypto/sha256-glue.c|   1 +
>  arch/sparc/crypto/aes_glue.c   |   3 +-
>  arch/x86/crypto/aesni-intel_glue.c |   4 +-
>  arch/x86/crypto/blowfish_glue.c|   3 +-
>  arch/x86/crypto/cast5_avx_glue.c   |   3 +-
>  arch/x86/crypto/des3_ede_glue.c|   3 +-
>  crypto/algapi.c|  25 +-
>  crypto/ctr.c   

Re: [PATCH v1] crypto: caam - set hwrng quality level

2017-08-02 Thread Horia Geantă
On 7/20/2017 4:08 PM, Harald Freudenberger wrote:
> On 07/19/2017 08:13 PM, Oleksij Rempel wrote:
>> On Wed, Jul 19, 2017 at 04:53:21PM +, Horia Geantă wrote:
>>> On 7/19/2017 7:32 PM, Oleksij Rempel wrote:
 On Wed, Jul 19, 2017 at 12:49:47PM +, Horia Geantă wrote:
> On 7/19/2017 10:45 AM, Oleksij Rempel wrote:
>> According to the documentation, it is a NIST-certified TRNG.
>> So, set a high quality value to let the HWRNG framework automatically use it.
>>
>> Signed-off-by: Oleksij Rempel 
>> ---
>>  drivers/crypto/caam/caamrng.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
>> index 41398da3edf4..684c0bc88dfd 100644
>> --- a/drivers/crypto/caam/caamrng.c
>> +++ b/drivers/crypto/caam/caamrng.c
>> @@ -292,10 +292,16 @@ static int caam_init_rng(struct caam_rng_ctx *ctx, struct device *jrdev)
>>  return 0;
>>  }
>>  
>> +/*
>> + * hwrng register struct
>> + * The trng is supposed to have 100% entropy, and thus
>> + * we register with a very high quality value.
>> + */
>>  static struct hwrng caam_rng = {
>>  .name   = "rng-caam",
>>  .cleanup= caam_cleanup,
>>  .read   = caam_read,
>> +.quality = 999,
> Why not 1024, i.e. where is 999 coming from?
 It comes from s390-trng.c driver.
 Should I use 1024 instead?

>>> AFAICT the range for quality is from 0 to 1024 (no entropy -> perfect
>>> entropy).
>>>
>>> 1024 should be used since I'd expect a HW TRNG to provide perfect
>>> entropy unless otherwise stated.
>> I assume 1024 can be given only on verified HW with accessible verilog
>> files and compared with resulting chip :)
>> May be this would be a good example https://www.sifive.com/
>>
> Well, the header file says:
> ...
> /**
>  * struct hwrng - Hardware Random Number Generator driver
>  * @name:   Unique RNG name.
>  * @init:   Initialization callback (can be NULL).
>  * @cleanup:Cleanup callback (can be NULL).
>  * @data_present:   Callback to determine if data is available
>  *  on the RNG. If NULL, it is assumed that
>  *  there is always data available.  *OBSOLETE*
>  * @data_read:  Read data from the RNG device.
>  *  Returns the number of lower random bytes in "data".
>  *  Must not be NULL.*OBSOLETE*
>  * @read:   New API. drivers can fill up to max bytes of data
>  *  into the buffer. The buffer is aligned for any type
>  *  and max is a multiple of 4 and >= 32 bytes.
>  * @priv:   Private data, for use by the RNG driver.
>  * @quality:    Estimation of true entropy in RNG's bitstream
>  *  (per mill).
>  */
> ...
> quality = estimation of true entropy per mill.
"per mill as in https://en.wikipedia.org/wiki/Mill_(currency) ?
I consider it rather unfortunate, as already noticed here:
https://lkml.org/lkml/2014/3/27/210

And isn't this inaccurate, since the (de)rating factor is quality/1024,
not quality/1000?

> I understand this as quality=1000 meaning 100% entropy.
> However, the core code currently does not really check this value.
> When more than one hwrng sources do register, simple the one with
> the higher quality value wins :-) The value is not even checked
> to be in a given range.
> 
> I searched through some device drivers which do register at
> the hwrng and it looks like most of the drivers do not even
> set this value. My feeling is, you should use 999 when your

Maybe this is because it's not clear how to determine quality's value?

Take CAAM's engine HWRNG: it can work both as a TRNG and as a
TRNG-seeded DRBG (that's how it's currently configured).
IIUC, both setups are fit as source for the entropy pool.

Do I have to set the quality value by comparing the two cases?
(It's a bit like comparing the quality of entropy offered by RDSEED vs.
RDRAND.)
Meaning: give full credit - maximum quality - for the TRNG setup, and
provide a lower value for quality in the case of TRNG-seeded DRBG?

> hardware provides 'perfect' random. So there is a chance for
> an even 'more perfect' hardware coming up later to overrule
> your 'perfect' hardware.

I am not sure why the hwrng with the best quality wins, instead of using
all available resources, as suggested here:
https://lkml.org/lkml/2014/3/27/210

Thanks,
Horia


Re: [PATCH v3] crypto: ixp4xx - Fix error handling path in 'aead_perform()'

2017-08-02 Thread Arnd Bergmann
On Wed, Aug 2, 2017 at 10:40 AM, Herbert Xu  wrote:
> On Thu, Jul 20, 2017 at 09:37:10AM +0200, Arnd Bergmann wrote:
>
> Oops, looks like I introduced the bug.  Anyway, such is the danger
> of fixing compiler warnings in rarely used code.
>
> For some reason your patch is corrupted in patchwork.  Also we don't
> need to check crypt->dst as free_buf_chain is a no-op if we didn't do
> the allocation at all.  So how about this patch?

Looks good to me.

> ---8<---
> In commit 0f987e25cb8a, the source processing has been moved in front of
> the destination processing, but the error handling path has not been
> modified accordingly.
> Free resources in the correct order to avoid some leaks.
>
> Fixes: 0f987e25cb8a ("crypto: ixp4xx - Fix false lastlen uninitialised warning")
> Reported-by: Christophe JAILLET 
> Signed-off-by: Herbert Xu 

Reviewed-by: Arnd Bergmann 


Re: [Patch V2] crypto: x86/sha1 : Fix reads beyond the number of blocks passed

2017-08-02 Thread Jan Stancek
On 08/02/2017 02:38 AM, Megha Dey wrote:
> It was reported that the sha1 AVX2 function (sha1_transform_avx2) is
> reading ahead beyond its intended data, and causing a crash if the next
> block is beyond a page boundary:
> http://marc.info/?l=linux-crypto-vger&m=149373371023377
> 
> This patch makes sure that there is no overflow for any buffer length.
> 
> It passes the tests written by Jan Stancek that revealed this problem:
> https://github.com/jstancek/sha1-avx2-crash
> 
> Jan, can you verify this fix?

Hi,

Looks good, my tests (below) PASS-ed.

I updated reproducer at [1] to try more than just single scenario. It
now tries thousands of combinations with different starting offset within
page and sizes of data chunks.

(without patch) v4.13-rc3-102-g26c5ceb with avx2 enabled:
[  106.455175] sha_test module loaded
[  106.460458] data is at 0x8c88e58f8000, page_after_data: 0x8c88e58fa000
[  106.468625] sha_test testing hash: sha1-generic, maxlen: 8192
[  127.091237] sha_test hash: sha1-generic calculated 1764032 hashes
[  127.098038] sha_test testing hash: sha1-ni, maxlen: 8192
[  127.108805] failed to alloc sha1-ni
[  127.112703] sha_test testing hash: sha1-avx, maxlen: 8192
[  141.166611] sha_test hash: sha1-avx calculated 1764032 hashes
[  141.173023] sha_test testing hash: sha1-avx2, maxlen: 8192
[  141.180046] BUG: unable to handle kernel paging request at 8c88e58fa000
[  141.187817] IP: _begin+0x28/0x187
[  141.191512] PGD 89c578067
[  141.191513] P4D 89c578067
[  141.194529] PUD c61f0a063
[  141.197545] PMD c64cb1063
[  141.200561] PTE 800c658fa062
[  141.203577]
[  141.208832] Oops:  [#1] SMP
...

With patch "[Patch V2] crypto: x86/sha1 : Fix reads beyond the number of blocks 
passed":
[  406.323781] sha_test module loaded
[  406.329062] data is at 0x93ab97108000, page_after_data: 0x93ab9710a000
[  406.337185] sha_test testing hash: sha1-generic, maxlen: 8192
[  426.157242] sha_test hash: sha1-generic calculated 1764032 hashes
[  426.164043] sha_test testing hash: sha1-ni, maxlen: 8192
[  426.174244] failed to alloc sha1-ni
[  426.178139] sha_test testing hash: sha1-avx, maxlen: 8192
[  440.226240] sha_test hash: sha1-avx calculated 1764032 hashes
[  440.232651] sha_test testing hash: sha1-avx2, maxlen: 8192
[  452.497988] sha_test hash: sha1-avx2 calculated 1764032 hashes
[  452.504495] sha_test testing hash: sha1-ssse3, maxlen: 8192
[  467.319316] sha_test hash: sha1-ssse3 calculated 1764032 hashes
[  467.325922] sha_test done
[  471.827083] sha_test module unloaded

I also ran a test at [2], which is comparing hashes produced by
kernel with hashes produced by user-space utilities (e.g. 'sha1sum').
This test also PASS-ed.

Regards,
Jan

[1] https://github.com/jstancek/sha1-avx2-crash
[2] https://github.com/jstancek/kernel_sha_test/tree/sha1-avx2-4.13

> Herbert, can you re-enable sha1-avx2 once Jan has checked it out and
> revert commit b82ce24426a4071da9529d726057e4e642948667 ?
> 
> Originally-by: Ilya Albrekht 
> Signed-off-by: Megha Dey 
> Reported-by: Jan Stancek 




Re: [PATCH v3] crypto: caam: Remove unused dentry members

2017-08-02 Thread Horia Geantă
On 8/1/2017 4:45 PM, Fabio Estevam wrote:
> Most of the dentry members from structure caam_drv_private
> are never used at all, so it is safe to remove them.
> 
> Since debugfs_remove_recursive() is called, we don't need the
> file entries.
> 
> Signed-off-by: Fabio Estevam 
Acked-by: Horia Geantă 

Thanks,
Horia



[PATCH v3] crypto: ixp4xx - Fix error handling path in 'aead_perform()'

2017-08-02 Thread Herbert Xu
On Thu, Jul 20, 2017 at 09:37:10AM +0200, Arnd Bergmann wrote:
>
> Thanks for spotting my mistake!
> 
> I've looked at it again and think it's unfortunately still wrong with
> your patch,
> as there is a 'goto free_buf_src' after dma_pool_alloc(), and that now needs
> to jump to free_buf_dst instead. We may also need an extra check to make
> sure we don't free an uninitialized pointer again.
> 
> Can you have a look at this version below and send whatever you find
> to be correct in the end?
> 
> Signed-off-by: Arnd Bergmann 

Oops, looks like I introduced the bug.  Anyway, such is the danger
of fixing compiler warnings in rarely used code.

For some reason your patch is corrupted in patchwork.  Also we don't
need to check crypt->dst as free_buf_chain is a no-op if we didn't do
the allocation at all.  So how about this patch?

---8<---
In commit 0f987e25cb8a, the source processing has been moved in front of
the destination processing, but the error handling path has not been
modified accordingly.
Free resources in the correct order to avoid some leaks.

Fixes: 0f987e25cb8a ("crypto: ixp4xx - Fix false lastlen uninitialised warning")
Reported-by: Christophe JAILLET 
Signed-off-by: Herbert Xu 

diff --git a/drivers/crypto/ixp4xx_crypto.c b/drivers/crypto/ixp4xx_crypto.c
index 427cbe0..55dc9c2 100644
--- a/drivers/crypto/ixp4xx_crypto.c
+++ b/drivers/crypto/ixp4xx_crypto.c
@@ -1073,7 +1073,7 @@ static int aead_perform(struct aead_request *req, int encrypt,
req_ctx->hmac_virt = dma_pool_alloc(buffer_pool, flags,
   &crypt->icv_rev_aes);
if (unlikely(!req_ctx->hmac_virt))
-   goto free_buf_src;
+   goto free_buf_dst;
if (!encrypt) {
scatterwalk_map_and_copy(req_ctx->hmac_virt,
req->src, cryptlen, authsize, 0);
@@ -1088,10 +1088,10 @@ static int aead_perform(struct aead_request *req, int encrypt,
BUG_ON(qmgr_stat_overflow(SEND_QID));
return -EINPROGRESS;
 
-free_buf_src:
-   free_buf_chain(dev, req_ctx->src, crypt->src_buf);
 free_buf_dst:
free_buf_chain(dev, req_ctx->dst, crypt->dst_buf);
+free_buf_src:
+   free_buf_chain(dev, req_ctx->src, crypt->src_buf);
crypt->ctl_flags = CTL_FLAG_UNUSED;
return -ENOMEM;
 }
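
For context, a toy illustration of the unwind idiom this fix restores -
error labels run in reverse order of allocation, so a failure at step N
falls through and frees steps N-1 down to 1 (standalone sketch, not the
driver code):

#include <stdlib.h>

static int setup(void **a, void **b, void **c)
{
	*a = malloc(16);
	if (!*a)
		return -1;
	*b = malloc(16);
	if (!*b)
		goto free_a;
	*c = malloc(16);
	if (!*c)
		goto free_b;
	return 0;

free_b:
	free(*b);
free_a:
	free(*a);
	return -1;
}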
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt