[PATCH 1/3] crypto: aesni - fix counter overflow handling in "by8" variant

2014-09-28 Thread Mathias Krause
The "by8" CTR AVX implementation fails to propperly handle counter
overflows. That was the reason it got disabled in commit 7da4b29d496b
("crypto: aesni - disable "by8" AVX CTR optimization").

Fix the overflow handling by incrementing the counter block as a double
quad word, i.e. a 128 bit, and testing for overflows afterwards. We need
to use VPTEST to do so as VPADD* does not set the flags itself and
silently drops the carry bit.

As this change adds branches to the hot path, minor performance
regressions  might be a side effect. But, OTOH, we now have a conforming
implementation -- the preferable goal.

A tcrypt test on a SandyBridge system (i7-2620M) showed almost identical
numbers for the old and this version with differences within the noise
range. A dm-crypt test with the fixed version gave even slightly better
results for this version. So the performance impact might not be as big
as expected.

Tested-by: Romain Francoise 
Signed-off-by: Mathias Krause 
Cc: Chandramouli Narayanan 
---
 arch/x86/crypto/aes_ctrby8_avx-x86_64.S |   17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S 
b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index f091f122ed24..a029bc744244 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -108,6 +108,10 @@
 
 byteswap_const:
.octa 0x000102030405060708090A0B0C0D0E0F
+ddq_low_msk:
+   .octa 0x
+ddq_high_add_1:
+   .octa 0x0001
 ddq_add_1:
.octa 0x0001
 ddq_add_2:
@@ -169,7 +173,12 @@ ddq_add_8:
.rept (by - 1)
club DDQ_DATA, i
club XDATA, i
-   vpaddd  var_ddq_add(%rip), xcounter, var_xdata
+   vpaddq  var_ddq_add(%rip), xcounter, var_xdata
+   vptest  ddq_low_msk(%rip), var_xdata
+   jnz 1f
+   vpaddq  ddq_high_add_1(%rip), var_xdata, var_xdata
+   vpaddq  ddq_high_add_1(%rip), xcounter, xcounter
+   1:
vpshufb xbyteswap, var_xdata, var_xdata
.set i, (i +1)
.endr
@@ -178,7 +187,11 @@ ddq_add_8:
 
vpxor   xkey0, xdata0, xdata0
club DDQ_DATA, by
-   vpaddd  var_ddq_add(%rip), xcounter, xcounter
+   vpaddq  var_ddq_add(%rip), xcounter, xcounter
+   vptest  ddq_low_msk(%rip), xcounter
+   jnz 1f
+   vpaddq  ddq_high_add_1(%rip), xcounter, xcounter
+   1:
 
.set i, 1
.rept (by - 1)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] Revert "crypto: aesni - disable "by8" AVX CTR optimization"

2014-09-28 Thread Mathias Krause
This reverts commit 7da4b29d496b1389d3a29b55d3668efecaa08ebd.

Now, that the issue is fixed, we can re-enable the code.

Signed-off-by: Mathias Krause 
Cc: Chandramouli Narayanan 
---
 arch/x86/crypto/aesni-intel_glue.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_glue.c 
b/arch/x86/crypto/aesni-intel_glue.c
index a7ccd57f19e4..888950f29fd9 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -481,7 +481,7 @@ static void ctr_crypt_final(struct crypto_aes_ctx *ctx,
crypto_inc(ctrblk, AES_BLOCK_SIZE);
 }
 
-#if 0  /* temporary disabled due to failing crypto tests */
+#ifdef CONFIG_AS_AVX
 static void aesni_ctr_enc_avx_tfm(struct crypto_aes_ctx *ctx, u8 *out,
  const u8 *in, unsigned int len, u8 *iv)
 {
@@ -1522,7 +1522,7 @@ static int __init aesni_init(void)
aesni_gcm_dec_tfm = aesni_gcm_dec;
}
aesni_ctr_enc_tfm = aesni_ctr_enc;
-#if 0  /* temporary disabled due to failing crypto tests */
+#ifdef CONFIG_AS_AVX
if (cpu_has_avx) {
/* optimize performance of ctr mode encryption transform */
aesni_ctr_enc_tfm = aesni_ctr_enc_avx_tfm;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] crypto: aesni - remove unused defines in "by8" variant

2014-09-28 Thread Mathias Krause
The defines for xkey3, xkey6 and xkey9 are not used in the code. They're
probably left overs from merging the three source files for 128, 192 and
256 bit AES. They can safely be removed.

Signed-off-by: Mathias Krause 
Cc: Chandramouli Narayanan 
---
 arch/x86/crypto/aes_ctrby8_avx-x86_64.S |3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S 
b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index a029bc744244..2df2a0298f5a 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -79,9 +79,6 @@
 #define xcounter   %xmm8
 #define xbyteswap  %xmm9
 #define xkey0  %xmm10
-#define xkey3  %xmm11
-#define xkey6  %xmm12
-#define xkey9  %xmm13
 #define xkey4  %xmm11
 #define xkey8  %xmm12
 #define xkey12 %xmm13
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] crypto: aesni - fix and re-enable "by8" CTR variant

2014-09-28 Thread Mathias Krause
This series fixes the counter overflow handling of the "by8" CTR variant
which lead to failing cryptomgr tests and, in turn, disabling this
optimization with commit 7da4b29d496b.

Patch 1 fixes the bug, patch 2 removes some unused defines (left overs
from the unification of the initial source files) and patch 3 re-enables
the code.

The fix was tested by me, doing tcrypt and dm-crypt tests. It was also
tested by Romain who initially reported the issue.

The patches should go on top of crypto-2.6.git.

In case this doesn't get merged for v3.17, patches 1 and 3 may be cc'ed
to stable to propagate the fix.

Please apply!

Thanks,
Mathias


Mathias Krause (3):
  crypto: aesni - fix counter overflow handling in "by8" variant
  crypto: aesni - remove unused defines in "by8" variant
  Revert "crypto: aesni - disable "by8" AVX CTR optimization"

 arch/x86/crypto/aes_ctrby8_avx-x86_64.S |   20 +++-
 arch/x86/crypto/aesni-intel_glue.c  |4 ++--
 2 files changed, 17 insertions(+), 7 deletions(-)

-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html