Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-12 Thread Herbert Xu
On Fri, Sep 11, 2015 at 01:10:27PM -0700, Tim Chen wrote:
> 
> Mmmm..., this is a restructuring of the algorithms within
> the glue code into multiple drivers instead of one and exposing
> them all.  It is a bit orthogonal to the intention of this
> patch set.  I think it is better that I create a 
> separate patch on the glue code on top of this patch set 
> to implement this.
> 
> Herbert, do you agree with this approach?

Yes I think we can work on the individual crypto registration
separately from this patch-set.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-11 Thread Tim Chen
On Fri, 2015-09-11 at 12:15 -0700, David Miller wrote:
> From: Tim Chen 
> Date: Fri, 11 Sep 2015 11:49:32 -0700
> 
> > Still, why would some kernel module specifically not want to 
> > use the fastest HW implementation, and explicitly ask for
> > a slower driver?
> 
> Temporary workaround if a bug is found.
> 
> There is really no reason to prevent the user from having this
> flexibility, and in return anyone can test any implementation
> their cpu can support.

Mmmm..., this is a restructuring of the algorithms within
the glue code into multiple drivers instead of one and exposing
them all.  It is a bit orthogonal to the intention of this
patch set.  I think it is better that I create a 
separate patch on the glue code on top of this patch set 
to implement this.

Herbert, do you agree with this approach?

Tim

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-11 Thread Stephan Mueller
Am Freitag, 11. September 2015, 11:49:32 schrieb Tim Chen:

Hi Tim,

>On Fri, 2015-09-11 at 19:02 +0200, Stephan Mueller wrote:
>> Am Donnerstag, 10. September 2015, 17:04:31 schrieb Tim Chen:
>> 
>> Hi Tim,
>> 
>> >Is there a scenario you can think of
>> >when a lower performing sha1 transform needs to
>> >be exposed as a separate driver?
>> 
>> My immediate concern is testing: it is hard to test the individual
>> implementations.
>
>Not hard, just one line in the glue code to set the transform
>to the one you need it you really want to test individual
>implementation.  Usually user of sha don't care which sha driver
>they got, but just the highest priority one.
>So you will anyway need to patch and change the priority of the sha
>driver to expose a specific one for testing.

Sure, it is not hard when you recompile. But when you have to test one given 
kernel binary, it is a challenge.
>
>> >Otherwise the glue code logic will only expose the
>> >best performing one for a cpu and hide the others, which was intentional
>> >on our part to prevent a lower performing sha from getting used.
>> 
>> Agreed, but the kernel crypto API does that already using the priorities --
>> IMHO a very clean and easy to interpret solution.
>> 
>> Furthermore, if somebody really has a need to not use the fastest HW
>> implementation, the kernel crypto API allows him to do that. With the hard-
>> wired approach in the glue file, you are stuck.
>
>Still, why would some kernel module specifically not want to
>use the fastest HW implementation, and explicitly ask for
>a slower driver?

I have seen one instance where a hardware driver was broken on one particular 
hardware. So, the only way was to disable it. In our case here, disabling 
means to go back to the software implementation of SHA.

Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-11 Thread David Miller
From: Tim Chen 
Date: Fri, 11 Sep 2015 11:49:32 -0700

> Still, why would some kernel module specifically not want to 
> use the fastest HW implementation, and explicitly ask for
> a slower driver?

Temporary workaround if a bug is found.

There is really no reason to prevent the user from having this
flexibility, and in return anyone can test any implementation
their cpu can support.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-11 Thread Tim Chen
On Fri, 2015-09-11 at 19:02 +0200, Stephan Mueller wrote:
> Am Donnerstag, 10. September 2015, 17:04:31 schrieb Tim Chen:
> 
> Hi Tim,
> >
> >Is there a scenario you can think of
> >when a lower performing sha1 transform needs to
> >be exposed as a separate driver?
> 
> My immediate concern is testing: it is hard to test the individual 
> implementations.
> >

Not hard, just one line in the glue code to set the transform
to the one you need it you really want to test individual 
implementation.  Usually user of sha don't care which sha driver
they got, but just the highest priority one. 
So you will anyway need to patch and change the priority of the sha
driver to expose a specific one for testing.

> >Otherwise the glue code logic will only expose the
> >best performing one for a cpu and hide the others, which was intentional
> >on our part to prevent a lower performing sha from getting used.
> 
> Agreed, but the kernel crypto API does that already using the priorities -- 
> IMHO a very clean and easy to interpret solution.
> 
> Furthermore, if somebody really has a need to not use the fastest HW 
> implementation, the kernel crypto API allows him to do that. With the hard-
> wired approach in the glue file, you are stuck.

Still, why would some kernel module specifically not want to 
use the fastest HW implementation, and explicitly ask for
a slower driver?

Tim

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-11 Thread Stephan Mueller
Am Donnerstag, 10. September 2015, 17:04:31 schrieb Tim Chen:

Hi Tim,
>
>Is there a scenario you can think of
>when a lower performing sha1 transform needs to
>be exposed as a separate driver?

My immediate concern is testing: it is hard to test the individual 
implementations.
>
>Otherwise the glue code logic will only expose the
>best performing one for a cpu and hide the others, which was intentional
>on our part to prevent a lower performing sha from getting used.

Agreed, but the kernel crypto API does that already using the priorities -- 
IMHO a very clean and easy to interpret solution.

Furthermore, if somebody really has a need to not use the fastest HW 
implementation, the kernel crypto API allows him to do that. With the hard-
wired approach in the glue file, you are stuck.


Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-10 Thread Tim Chen
On Fri, 2015-09-11 at 00:52 +0200, Stephan Mueller wrote:
> Am Donnerstag, 10. September 2015, 15:27:20 schrieb Tim Chen:
> 
> Hi Tim,
> 
> >This patch adds the glue code to detect and utilize the Intel SHA
> >extensions optimized SHA1 and SHA256 update transforms when available.
> >
> >This code has been tested on Broxton for functionality.
> 
> A general comment on this file: shouldn't this file be cleaned and use the 
> standard mechanisms of the kernel crypto API?
> 
> This glue implements its own selection of which SHA implementation to use. 
> But 
> the kernel crypto API implements that logic already. The issue with the 
> current implementation in this file is that you have no clue which particular 
> implementation of SHA is in use in one particular case.
> 
> So, may I suggest a restructuring to define independent instances of SHA, 
> such 
> as
> 
> - cra_name == "sha1", cra_driver_name="sha1_ssse3", cra_priority=300
> - cra_name == "sha1", cra_driver_name="sha1_avx", cra_priority=400
> - cra_name == "sha1", cra_driver_name="sha1_avx2", cra_priority=500
> - cra_name == "sha1", cra_driver_name="sha1_shavx", cra_priority=600
> 
> Similarly for the other SHAs?
> 
> In all the register functions for the ciphers, you can bail out if the 
> hardware does not support an implementation.

Stephen,

Is there a scenario you can think of 
when a lower performing sha1 transform needs to
be exposed as a separate driver?  

Otherwise the glue code logic will only expose the
best performing one for a cpu and hide the others, which was intentional
on our part to prevent a lower performing sha from getting used.

Tim


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-10 Thread Stephan Mueller
Am Donnerstag, 10. September 2015, 15:27:20 schrieb Tim Chen:

Hi Tim,

>This patch adds the glue code to detect and utilize the Intel SHA
>extensions optimized SHA1 and SHA256 update transforms when available.
>
>This code has been tested on Broxton for functionality.

A general comment on this file: shouldn't this file be cleaned and use the 
standard mechanisms of the kernel crypto API?

This glue implements its own selection of which SHA implementation to use. But 
the kernel crypto API implements that logic already. The issue with the 
current implementation in this file is that you have no clue which particular 
implementation of SHA is in use in one particular case.

So, may I suggest a restructuring to define independent instances of SHA, such 
as

- cra_name == "sha1", cra_driver_name="sha1_ssse3", cra_priority=300
- cra_name == "sha1", cra_driver_name="sha1_avx", cra_priority=400
- cra_name == "sha1", cra_driver_name="sha1_avx2", cra_priority=500
- cra_name == "sha1", cra_driver_name="sha1_shavx", cra_priority=600

Similarly for the other SHAs?

In all the register functions for the ciphers, you can bail out if the 
hardware does not support an implementation.

Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] crypto: [sha] glue code for Intel SHA extensions optimized SHA1 & SHA256

2015-09-10 Thread Tim Chen

This patch adds the glue code to detect and utilize the Intel SHA
extensions optimized SHA1 and SHA256 update transforms when available.

This code has been tested on Broxton for functionality.

Originally-by: Chandramouli Narayanan 
Signed-off-by: Tim Chen 
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 12 +++-
 arch/x86/crypto/sha256_ssse3_glue.c | 38 ++---
 2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c 
b/arch/x86/crypto/sha1_ssse3_glue.c
index 7c48e8b..98be8cc 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -44,6 +44,10 @@ asmlinkage void sha1_transform_avx(u32 *digest, const char 
*data,
 asmlinkage void sha1_transform_avx2(u32 *digest, const char *data,
unsigned int rounds);
 #endif
+#ifdef CONFIG_AS_SHA1_NI
+asmlinkage void sha1_ni_transform(u32 *digest, const char *data,
+  unsigned int rounds);
+#endif
 
 static void (*sha1_transform_asm)(u32 *, const char *, unsigned int);
 
@@ -166,12 +170,18 @@ static int __init sha1_ssse3_mod_init(void)
 #endif
}
 #endif
+#ifdef CONFIG_AS_SHA1_NI
+   if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
+   sha1_transform_asm = sha1_ni_transform;
+   algo_name = "SHA-NI";
+   }
+#endif
 
if (sha1_transform_asm) {
pr_info("Using %s optimized SHA-1 implementation\n", algo_name);
return crypto_register_shash(&alg);
}
-   pr_info("Neither AVX nor AVX2 nor SSSE3 is available/usable.\n");
+   pr_info("Neither AVX nor AVX2 nor SSSE3/SHA-NI is available/usable.\n");
 
return -ENODEV;
 }
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c 
b/arch/x86/crypto/sha256_ssse3_glue.c
index f8097fc..9c7b22c 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -50,6 +50,10 @@ asmlinkage void sha256_transform_avx(u32 *digest, const char 
*data,
 asmlinkage void sha256_transform_rorx(u32 *digest, const char *data,
  u64 rounds);
 #endif
+#ifdef CONFIG_AS_SHA256_NI
+asmlinkage void sha256_ni_transform(u32 *digest, const char *data,
+  u64 rounds); /*unsigned int rounds);*/
+#endif
 
 static void (*sha256_transform_asm)(u32 *, const char *, u64);
 
@@ -142,36 +146,40 @@ static bool __init avx_usable(void)
 
 static int __init sha256_ssse3_mod_init(void)
 {
+   char *algo;
+
/* test for SSSE3 first */
-   if (cpu_has_ssse3)
+   if (cpu_has_ssse3) {
sha256_transform_asm = sha256_transform_ssse3;
+   algo = "SSSE3";
+   }
 
 #ifdef CONFIG_AS_AVX
/* allow AVX to override SSSE3, it's a little faster */
if (avx_usable()) {
+   sha256_transform_asm = sha256_transform_avx;
+   algo = "AVX";
 #ifdef CONFIG_AS_AVX2
-   if (boot_cpu_has(X86_FEATURE_AVX2) && 
boot_cpu_has(X86_FEATURE_BMI2))
+   if (boot_cpu_has(X86_FEATURE_AVX2) &&
+   boot_cpu_has(X86_FEATURE_BMI2)) {
sha256_transform_asm = sha256_transform_rorx;
-   else
+   algo = "AVX2";
+   }
+#endif
+   }
 #endif
-   sha256_transform_asm = sha256_transform_avx;
+#ifdef CONFIG_AS_SHA256_NI
+   if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
+   sha256_transform_asm = sha256_ni_transform;
+   algo = "SHA-256-NI";
}
 #endif
 
if (sha256_transform_asm) {
-#ifdef CONFIG_AS_AVX
-   if (sha256_transform_asm == sha256_transform_avx)
-   pr_info("Using AVX optimized SHA-256 implementation\n");
-#ifdef CONFIG_AS_AVX2
-   else if (sha256_transform_asm == sha256_transform_rorx)
-   pr_info("Using AVX2 optimized SHA-256 
implementation\n");
-#endif
-   else
-#endif
-   pr_info("Using SSSE3 optimized SHA-256 
implementation\n");
+   pr_info("Using %s optimized SHA-256 implementation\n", algo);
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
}
-   pr_info("Neither AVX nor SSSE3 is available/usable.\n");
+   pr_info("Neither AVX nor SSSE3/SHA-NI is available/usable.\n");
 
return -ENODEV;
 }
-- 
2.4.2



--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html