Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-03 Thread Ingo Molnar
* Huang Ying ying.hu...@intel.com wrote: On Mon, 2009-11-02 at 22:32 +0800, Ingo Molnar wrote: * Herbert Xu herb...@gondor.apana.org.au wrote: On Mon, Nov 02, 2009 at 08:50:39AM +0100, Ingo Molnar wrote: A cleanup request: mind creating two macros for this PSHUFB MMX/SSE

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-03 Thread Herbert Xu
On Mon, Nov 02, 2009 at 04:46:04PM +0100, Ingo Molnar wrote:
Yeah. Or just a single block of:
#ifndef __ASSEMBLY__
...
#endif /* __ASSEMBLY__ */
around the C bits - anything outside that is good for assembly as well.
OK I'll throw this into cryptodev: commit
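For readers unfamiliar with the construct being discussed, here is a minimal sketch of a header laid out that way. It assumes, as in the kernel build, that __ASSEMBLY__ is defined when .S files are preprocessed. Every name in it (EXAMPLE_STATE_SIZE, struct example_state, example_state_size) is hypothetical and not taken from i387.h.

```c
/* Hypothetical header fragment; none of these names come from i387.h. */

/* Plain preprocessor constants sit outside the guard, so a .S file
 * that includes this header can use them as well. */
#define EXAMPLE_STATE_SIZE 512

#ifndef __ASSEMBLY__
/* C-only declarations: the assembler cannot parse these, so they are
 * hidden whenever __ASSEMBLY__ is defined. */
struct example_state {
	unsigned char bytes[EXAMPLE_STATE_SIZE];
};

static inline unsigned long example_state_size(void)
{
	return sizeof(struct example_state);
}
#endif /* __ASSEMBLY__ */
```

Compiled as C, the struct and the inline function are visible; preprocessed with -D__ASSEMBLY__, only the #define survives.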

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-03 Thread H. Peter Anvin
On 11/03/2009 01:03 AM, Ingo Molnar wrote:
.macro xmm_num opd xmm
.ifc \xmm,%xmm0
\opd = 0
.endif
.ifc \xmm,%xmm1
\opd = 1
.endif
.ifc \xmm,%xmm2
\opd = 2
.endif
.ifc \xmm,%xmm3
\opd = 3
.endif
.ifc \xmm,%xmm4
\opd = 4
.endif
.ifc \xmm,%xmm5
\opd = 5
.endif
.ifc \xmm,%xmm6

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-02 Thread Herbert Xu
On Mon, Nov 02, 2009 at 08:50:39AM +0100, Ingo Molnar wrote: A cleanup request: mind creating two macros for this PSHUFB MMX/SSE instruction in arch/x86/include/asm/i387.h, instead of open-coding the .byte sequences in ~6 places? I had a go at doing that, but it seems that i387.h isn't

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-02 Thread Ingo Molnar
* Herbert Xu herb...@gondor.apana.org.au wrote: On Mon, Nov 02, 2009 at 08:50:39AM +0100, Ingo Molnar wrote: A cleanup request: mind creating two macros for this PSHUFB MMX/SSE instruction in arch/x86/include/asm/i387.h, instead of open-coding the .byte sequences in ~6 places? I

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-02 Thread Herbert Xu
On Mon, Nov 02, 2009 at 03:32:58PM +0100, Ingo Molnar wrote:
Please use the standard construct and put an #ifndef __ASSEMBLY__ around it.
You mean like this?
diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 0b20bbb..e22d237 100644
---

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-02 Thread Ingo Molnar
* Herbert Xu herb...@gondor.apana.org.au wrote: On Mon, Nov 02, 2009 at 03:32:58PM +0100, Ingo Molnar wrote: Please use the standard construct and put an #ifndef __ASSEMBLY__ around it. You mean like this? diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-02 Thread Huang Ying
On Mon, 2009-11-02 at 22:32 +0800, Ingo Molnar wrote: * Herbert Xu herb...@gondor.apana.org.au wrote: On Mon, Nov 02, 2009 at 08:50:39AM +0100, Ingo Molnar wrote: A cleanup request: mind creating two macros for this PSHUFB MMX/SSE instruction in arch/x86/include/asm/i387.h,

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-01 Thread Herbert Xu
On Sat, Oct 31, 2009 at 05:30:15PM -0700, Andrew Morton wrote:
x86_64 allmodconfig, GNU assembler 2.16.1:
arch/x86/crypto/ghash-clmulni-intel_asm.S: Assembler messages:
arch/x86/crypto/ghash-clmulni-intel_asm.S:103: Error: no such instruction: `pshufb %xmm5,%xmm0'

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-01 Thread Ingo Molnar
* Herbert Xu herb...@gondor.apana.org.au wrote:
- pshufb BSWAP, DATA
+ # pshufb BSWAP, DATA
+ .byte 0x66, 0x0f, 0x38, 0x00, 0xc5
A cleanup request: mind creating two macros for this PSHUFB MMX/SSE instruction in arch/x86/include/asm/i387.h, instead of open-coding the .byte
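As an illustration of where those five bytes come from, the sketch below builds the register-to-register SSE encoding of PSHUFB (66 0F 38 00 /r) in C. It assumes BSWAP and DATA name %xmm5 and %xmm0, which is what the assembler error reported earlier in the thread suggests. The helper encode_pshufb_xmm is hypothetical and is not the macro that was proposed for the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode "pshufb %xmmSRC, %xmmDST" (AT&T operand
 * order) for xmm0..xmm7. Writes 5 bytes to out and returns 5, or
 * returns 0 if a register number is out of range.
 */
static size_t encode_pshufb_xmm(unsigned src, unsigned dst, uint8_t out[5])
{
	if (src > 7 || dst > 7)
		return 0;	/* xmm8..xmm15 need a REX prefix; not handled */

	out[0] = 0x66;	/* prefix selecting the XMM form of the instruction */
	out[1] = 0x0f;	/* three-byte opcode 0F 38 00 */
	out[2] = 0x38;
	out[3] = 0x00;
	/* ModRM: mod=11 (register direct), reg=destination, rm=source */
	out[4] = (uint8_t)(0xc0 | (dst << 3) | src);
	return 5;
}
```

With src = 5 and dst = 0 the ModRM byte is 0xc0 | (0 << 3) | 5 = 0xc5, giving the same sequence as the .byte line in the hunk.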

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-10-31 Thread Andrew Morton
On Mon, 19 Oct 2009 11:53:33 +0900 Herbert Xu herb...@gondor.apana.org.au wrote: On Wed, Sep 16, 2009 at 09:35:46AM +0800, Huang Ying wrote: PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More information about PCLMULQDQ can be found at:

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-10-18 Thread Herbert Xu
On Wed, Sep 16, 2009 at 09:35:46AM +0800, Huang Ying wrote: PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More information about PCLMULQDQ can be found at:

[PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-09-15 Thread Huang Ying
PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More information about PCLMULQDQ can be found at: http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/ Because PCLMULQDQ changes XMM state,
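For context on the operation being accelerated, the following is a portable C model of a 64x64-bit carry-less multiplication, the kind of product PCLMULQDQ computes in hardware. It is only a sketch: struct clmul128 and clmul64 are hypothetical names, the loop is not constant-time, and the reduction step that GHASH also requires is not shown. It is not the code from the patch.

```c
#include <stdint.h>

/* 128-bit result of a 64x64-bit carry-less product (hypothetical type). */
struct clmul128 {
	uint64_t lo;
	uint64_t hi;
};

/*
 * Multiply a and b as polynomials over GF(2): for each set bit i of b,
 * XOR in a shifted left by i. XOR replaces addition, so no carries
 * propagate between bit positions.
 */
static struct clmul128 clmul64(uint64_t a, uint64_t b)
{
	struct clmul128 r = { 0, 0 };
	unsigned i;

	for (i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			r.lo ^= a << i;
			if (i != 0)	/* a >> 64 would be undefined */
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}
```

For example, clmul64(3, 3) yields lo = 5, hi = 0, because (x + 1)^2 = x^2 + 1 over GF(2), whereas ordinary integer multiplication would give 9.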