[PATCH 3/3] crypto: x86/chacha20 - Add a 4-block AVX-512VL variant

2018-11-20 Thread Martin Willi
ion of ~20%. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20-avx512vl-x86_64.S | 272 + arch/x86/crypto/chacha20_glue.c| 7 + 2 files changed, 279 insertions(+) diff --git a/arch/x86/crypto/chacha20-avx512vl-x86_64.S b/arch/x86/crypto/chacha20-avx512vl-x86_

[PATCH 0/3] crypto: x86/chacha20 - AVX-512VL block functions

2018-11-20 Thread Martin Willi
1453 1947 1496 1477 1963 1438 1930 Martin Willi (3): crypto: x86/chacha20 - Add a 8-block AVX-512VL variant crypto: x86/chacha20 - Add a 2-block AVX-512VL variant crypto: x86/chacha20 - Add a 4-block AVX-512VL variant arch/x86/crypto/Makefile | 5 + arch

[PATCH 2/3] crypto: x86/chacha20 - Add a 2-block AVX-512VL variant

2018-11-20 Thread Martin Willi
to process a single block. Hence we engage that function for (partial) single block lengths as well. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20-avx512vl-x86_64.S | 171 + arch/x86/crypto/chacha20_glue.c| 7 + 2 files changed, 178 insertions(+) diff --git

[PATCH 1/3] crypto: x86/chacha20 - Add a 8-block AVX-512VL variant

2018-11-20 Thread Martin Willi
with dynamic masks is not part of the AVX-512VL instruction set, hence we depend on AVX-512BW as well. Given that the major AVX-512VL architectures provide AVX-512BW and this extension does not affect core clocking, this seems to be no problem at least for now. Signed-off-by: Martin Willi

Re: [PATCH 0/6] crypto: x86/chacha20 - SIMD performance improvements

2018-11-20 Thread Martin Willi
Hi Jason, > [...] I have a massive Xeon Gold 5120 machine that I can give you > access to if you'd like to do some testing and benching. Thanks for the offer, no need at this time. But I certainly would welcome if you could do some (Wireguard) benching with that code to see if it works for you.

Re: [PATCH 0/6] crypto: x86/chacha20 - SIMD performance improvements

2018-11-18 Thread Martin Willi
Hi Jason, > I'd be inclined to roll with your implementation if it can eventually > become competitive with Andy Polyakov's, [...] I think for the SSSE3/AVX2 code paths it is competitive; especially for small sizes it is faster, which is not that unimportant when implementing layer 3 VPNs. >

[PATCH 6/6] crypto: x86/chacha20 - Add a 4-block AVX2 variant

2018-11-11 Thread Martin Willi
it in place. The partial XORing function trailer is very similar to the AVX2 2-block variant. While it could be shared, that code segment is rather short; profiling is also easier with the trailer integrated, so we keep it per function. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20-avx2

[PATCH 3/6] crypto: x86/chacha20 - Support partial lengths in 8-block AVX2 variant

2018-11-11 Thread Martin Willi
. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20-avx2-x86_64.S | 189 + arch/x86/crypto/chacha20_glue.c| 5 +- 2 files changed, 133 insertions(+), 61 deletions(-) diff --git a/arch/x86/crypto/chacha20-avx2-x86_64.S b/arch/x86/crypto/chacha20-avx2-x86_64.S

[PATCH 4/6] crypto: x86/chacha20 - Use larger block functions more aggressively

2018-11-11 Thread Martin Willi
Now that all block functions support partial lengths, engage the wider block sizes more aggressively. This prevents using smaller block functions multiple times, where the next larger block function would have been faster. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20_glue.c | 39

[PATCH 1/6] crypto: x86/chacha20 - Support partial lengths in 1-block SSSE3 variant

2018-11-11 Thread Martin Willi
s probably not worth it. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20-ssse3-x86_64.S | 74 - arch/x86/crypto/chacha20_glue.c | 11 ++-- 2 files changed, 63 insertions(+), 22 deletions(-) diff --git a/arch/x86/crypto/chacha20-ssse3-x86_64.S b/arch/

[PATCH 2/6] crypto: x86/chacha20 - Support partial lengths in 4-block SSSE3 variant

2018-11-11 Thread Martin Willi
function. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20-ssse3-x86_64.S | 163 ++-- arch/x86/crypto/chacha20_glue.c | 5 +- 2 files changed, 128 insertions(+), 40 deletions(-) diff --git a/arch/x86/crypto/chacha20-ssse3-x86_64.S b/arch/x86/crypto/chacha20

[PATCH 0/6] crypto: x86/chacha20 - SIMD performance improvements

2018-11-11 Thread Martin Willi
1027 1522 1537 1440 1027 1564 1523 1448 1026 1507 1512 1456 1025 1515 1491 1464 1023 1522 1481 1472 1037 1559 1577 1480 927 1518 1559 1488 926 1514 1548 1496 926 1513 1534 Martin Willi (6): crypto: x86/chacha20 - Support partial lengths in 1-block SSSE3 variant crypto: x86/chacha20

[PATCH 5/6] crypto: x86/chacha20 - Add a 2-block AVX2 variant

2018-11-11 Thread Martin Willi
require a 4-block function. Signed-off-by: Martin Willi --- arch/x86/crypto/chacha20-avx2-x86_64.S | 197 + arch/x86/crypto/chacha20_glue.c| 7 + 2 files changed, 204 insertions(+) diff --git a/arch/x86/crypto/chacha20-avx2-x86_64.S b/arch/x86/crypto/chacha20-avx2

Re: [RFC PATCH] crypto: chacha20 - add implementation using 96-bit nonce

2017-12-10 Thread Martin Willi
Hi, > Anyway, I actually thought it was intentional that the ChaCha > implementations in the Linux kernel allowed specifying the block > counter, and therefore allowed seeking to any point in the keystream, > exposing the full functionality of the cipher. If I remember correctly, it was indeed

Re: [PATCH v4] poly1305: generic C can be faster on chips with slow unaligned access

2016-11-08 Thread Martin Willi
> By using the unaligned access helpers, we drastically improve > performance on small MIPS routers that have to go through the > exception fix-up handler for these unaligned accesses. I couldn't measure any slowdown here, so: Acked-by: Martin Willi <mar...@strongswan.org> >

Re: [PATCH] crypto: chacha20_4block_xor_ssse3: Align stack pointer to 64 bytes

2016-01-22 Thread Martin Willi
ersion seems to be ok, so is Poly1305. Acked-by: Martin Willi <mar...@strongswan.org> -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH v2 03/10] crypto: chacha20 - Add a SSSE3 SIMD variant for x86_64

2015-07-16 Thread Martin Willi
): 5360533 operations in 10 seconds (5489185792 bytes) test 4 (256 bit key, 8192 byte blocks): 692846 operations in 10 seconds (5675794432 bytes) Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile| 2 + arch

[PATCH v2 01/10] crypto: tcrypt - Add ChaCha20/Poly1305 speed tests

2015-07-16 Thread Martin Willi
Adds individual ChaCha20 and Poly1305 and a combined rfc7539esp AEAD speed test using mode numbers 214, 321 and 213. For Poly1305 we add a specific speed template, as it expects the key prepended to the input data. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/tcrypt.c | 15

[PATCH v2 06/10] crypto: testmgr - Add a longer ChaCha20 test vector

2015-07-16 Thread Martin Willi
The AVX2 variant of ChaCha20 is used only for messages with >= 512 bytes length. With the existing test vectors, the implementation could not be tested. Due to the lack of such a long official test vector, this one is self-generated using chacha20-generic. Signed-off-by: Martin Willi mar

[PATCH v2 07/10] crypto: poly1305 - Export common Poly1305 helpers

2015-07-16 Thread Martin Willi
As architecture specific drivers need a software fallback, export Poly1305 init/update/final functions together with some helpers in a header file. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/chacha20poly1305.c | 4 +-- crypto/poly1305_generic.c | 73

[PATCH v2 10/10] crypto: poly1305 - Add a four block AVX2 variant for x86_64

2015-07-16 Thread Martin Willi
): 684405 opers/sec, 2825226316 bytes/sec test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 367101 opers/sec, 3019039446 bytes/sec Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile | 1 + arch

[PATCH v2 05/10] crypto: chacha20 - Add an eight block AVX2 variant for x86_64

2015-07-16 Thread Martin Willi
operations in 10 seconds (18672197632 bytes) Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile | 1 + arch/x86/crypto/chacha20-avx2-x86_64.S | 443 + arch/x86/crypto/chacha20_glue.c

[PATCH v2 00/10] crypto: x86_64 - Add SSE/AVX2 ChaCha20/Poly1305 ciphers

2015-07-16 Thread Martin Willi
for typical IPsec MTUs. On Ivy Bridge using SSE2/SSSE3 the numbers compared to AES-GCM are very similar due to the less efficient CLMUL instructions. Changes in v2: - No code changes - Use sec=10 for more reliable benchmark results Martin Willi (10): crypto: tcrypt - Add ChaCha20/Poly1305 speed

[PATCH v2 04/10] crypto: chacha20 - Add a four block SSSE3 variant for x86_64

2015-07-16 Thread Martin Willi
in 10 seconds (11846409216 bytes) test 4 (256 bit key, 8192 byte blocks): 1448761 operations in 10 seconds (11868250112 bytes) Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/chacha20-ssse3-x86_64.S | 483

[PATCH v2 09/10] crypto: poly1305 - Add a two block SSE2 variant for x86_64

2015-07-16 Thread Martin Willi
. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/poly1305-sse2-x86_64.S | 306 + arch/x86/crypto/poly1305_glue.c| 54 +- 2 files changed, 355 insertions(+), 5 deletions(-) diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S b/arch/x86

[PATCH v2 08/10] crypto: poly1305 - Add a SSE2 SIMD variant for x86_64

2015-07-16 Thread Martin Willi
test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 153075 opers/sec, 1258896201 bytes/sec Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile | 2 + arch/x86/crypto/poly1305-sse2-x86_64.S | 276

Re: crypto: chacha20poly1305 - Convert to new AEAD interface

2015-07-16 Thread Martin Willi
add my: Tested-by: Martin Willi mar...@strongswan.org Regards Martin

Re: [PATCH 00/10] crypto: x86_64 - Add SSE/AVX2 ChaCha20/Poly1305 ciphers

2015-07-11 Thread Martin Willi
> If you're going to use sec you need to use at least 10 in order for it to be meaningful as shorter values often result in bogus numbers. Ok, I'll use sec=10 in v2. There is no fundamental difference compared to sec=1 (except for very short blocks): testing speed of

Re: [PATCH 00/10] crypto: x86_64 - Add SSE/AVX2 ChaCha20/Poly1305 ciphers

2015-07-08 Thread Martin Willi
Herbert, > Running the speed test with sec=1 makes no sense because it's too short. Please use sec=0 and count cycles instead. I get less constant numbers between different runs when using sec=0, hence I've used sec=1. Below are the numbers of average runs for the AEAD measuring cycles; I'll

[PATCH 00/10] crypto: x86_64 - Add SSE/AVX2 ChaCha20/Poly1305 ciphers

2015-07-07 Thread Martin Willi
CLMUL instructions. Martin Willi (10): crypto: tcrypt - Add ChaCha20/Poly1305 speed tests crypto: chacha20 - Export common ChaCha20 helpers crypto: chacha20 - Add a SSSE3 SIMD variant for x86_64 crypto: chacha20 - Add a four block SSSE3 variant for x86_64 crypto: chacha20 - Add an eight

[PATCH 03/10] crypto: chacha20 - Add a SSSE3 SIMD variant for x86_64

2015-07-07 Thread Martin Willi
in 1 seconds (532198400 bytes) test 4 (256 bit key, 8192 byte blocks): 67132 operations in 1 seconds (549945344 bytes) Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile| 2 + arch/x86/crypto/chacha20-ssse3

[PATCH 09/10] crypto: poly1305 - Add a two block SSE2 variant for x86_64

2015-07-07 Thread Martin Willi
. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/poly1305-sse2-x86_64.S | 306 + arch/x86/crypto/poly1305_glue.c| 54 +- 2 files changed, 355 insertions(+), 5 deletions(-) diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S b/arch/x86

[PATCH 10/10] crypto: poly1305 - Add a four block AVX2 variant for x86_64

2015-07-07 Thread Martin Willi
): 677578 opers/sec, 2797041984 bytes/sec test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 364094 opers/sec, 2994309056 bytes/sec Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile | 1 + arch

[PATCH 07/10] crypto: poly1305 - Export common Poly1305 helpers

2015-07-07 Thread Martin Willi
As architecture specific drivers need a software fallback, export Poly1305 init/update/final functions together with some helpers in a header file. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/chacha20poly1305.c | 4 +-- crypto/poly1305_generic.c | 73

[PATCH 05/10] crypto: chacha20 - Add an eight block AVX2 variant for x86_64

2015-07-07 Thread Martin Willi
bytes) Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile | 1 + arch/x86/crypto/chacha20-avx2-x86_64.S | 443 + arch/x86/crypto/chacha20_glue.c| 19 ++ crypto/Kconfig

[PATCH 06/10] crypto: testmgr - Add a longer ChaCha20 test vector

2015-07-07 Thread Martin Willi
The AVX2 variant of ChaCha20 is used only for messages with >= 512 bytes length. With the existing test vectors, the implementation could not be tested. Due to the lack of such a long official test vector, this one is self-generated using chacha20-generic. Signed-off-by: Martin Willi mar

[PATCH 08/10] crypto: poly1305 - Add a SSE2 SIMD variant for x86_64

2015-07-07 Thread Martin Willi
test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 153136 opers/sec, 1259390464 bytes/sec Benchmark results from a Core i5-4670T. Signed-off-by: Martin Willi mar...@strongswan.org --- arch/x86/crypto/Makefile | 2 + arch/x86/crypto/poly1305-sse2-x86_64.S | 276

[PATCH 01/10] crypto: tcrypt - Add ChaCha20/Poly1305 speed tests

2015-07-07 Thread Martin Willi
Adds individual ChaCha20 and Poly1305 and a combined rfc7539esp AEAD speed test using mode numbers 214, 321 and 213. For Poly1305 we add a specific speed template, as it expects the key prepended to the input data. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/tcrypt.c | 15

[PATCH] crypto: poly1305 - Pass key as first two message blocks to each desc_ctx

2015-06-16 Thread Martin Willi
The Poly1305 authenticator requires a unique key for each generated tag. This implies that we can't set the key per tfm, as multiple users set individual keys. Instead we pass a desc specific key as the first two blocks of the message to authenticate in update(). Signed-off-by: Martin Willi mar

Re: [PATCH 3/9] crypto: Add a generic Poly1305 authenticator implementation

2015-06-04 Thread Martin Willi
Herbert, > I just realised that this doesn't quite work. The key is shared by all users of the tfm, yet in your case you need it to be local. I agree, as Poly1305 uses a different key for each tag the current approach doesn't work. I think the simplest solution is to make the key the beginning

[PATCH 7/9] crypto: chacha20poly1305 - Add an IPsec variant for RFC7539 AEAD

2015-06-01 Thread Martin Willi
draft-ietf-ipsecme-chacha20-poly1305 defines the use of ChaCha20/Poly1305 in ESP. It uses additional four byte key material as a salt, which is then used with an 8 byte IV to form the ChaCha20 nonce as defined in the RFC7539. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto
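The nonce construction described in this snippet can be illustrated with a minimal userspace sketch. It is not the code from the patch: the helper name and the length macros are hypothetical, and placing the salt before the IV is an assumption taken from the ESP draft's layout rather than from the truncated text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ESP_SALT_LEN      4   /* extra key material used as salt */
#define ESP_IV_LEN        8   /* per-packet IV carried in ESP */
#define ESP_NONCE_LEN    12   /* ChaCha20 nonce size in RFC7539 */

/*
 * Hypothetical helper, for illustration only: concatenate the 4-byte
 * salt and the 8-byte IV into a 12-byte ChaCha20 nonce.
 * Assumption: the salt occupies the first four nonce bytes.
 */
static void esp_chacha20_nonce(uint8_t nonce[ESP_NONCE_LEN],
			       const uint8_t salt[ESP_SALT_LEN],
			       const uint8_t iv[ESP_IV_LEN])
{
	assert(nonce != NULL && salt != NULL && iv != NULL);
	memcpy(nonce, salt, ESP_SALT_LEN);
	memcpy(nonce + ESP_SALT_LEN, iv, ESP_IV_LEN);
}
```

Because the salt comes from the negotiated key material and the IV is per packet, only the IV half of the nonce changes from packet to packet under one SA.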

[PATCH 6/9] crypto: testmgr - Add ChaCha20-Poly1305 test vectors from RFC7539

2015-06-01 Thread Martin Willi
Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/testmgr.c | 15 crypto/testmgr.h | 269 +++ 2 files changed, 284 insertions(+) diff --git a/crypto/testmgr.c b/crypto/testmgr.c index faf93a6..915a9ef 100644 --- a/crypto

[PATCH 5/9] crypto: Add a ChaCha20-Poly1305 AEAD construction, RFC7539

2015-06-01 Thread Martin Willi
This AEAD uses a chacha20 ablkcipher and a poly1305 ahash to construct the ChaCha20-Poly1305 AEAD as defined in RFC7539. It supports both synchronous and asynchronous operations, even if we currently have no async chacha20 or poly1305 drivers. Signed-off-by: Martin Willi mar...@strongswan.org

[PATCH 2/9] crypto: testmgr - Add ChaCha20 test vectors from RFC7539

2015-06-01 Thread Martin Willi
We explicitly set the Initial block Counter by prepending it to the nonce in Little Endian. The same test vector is used for both encryption and decryption, ChaCha20 is a cipher XORing a keystream. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/testmgr.c | 15 + crypto

[PATCH 1/9] crypto: Add a generic ChaCha20 stream cipher implementation

2015-06-01 Thread Martin Willi
. It uses a 16-byte IV, which includes the 12-byte ChaCha20 nonce prepended by the initial block counter. Some algorithms require an explicit counter value, for example the mentioned AEAD construction. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/Kconfig| 13 +++ crypto
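The 16-byte IV layout mentioned here can be sketched as follows. This is a userspace illustration under stated assumptions, not the kernel implementation: the helper name and macros are hypothetical, and the counter is assumed to be a 32-bit value stored in little endian, as the related test-vector patch describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CC20_NONCE_LEN 12   /* ChaCha20 nonce */
#define CC20_IV_LEN    16   /* 4-byte block counter + 12-byte nonce */

/*
 * Hypothetical helper, for illustration only: build the 16-byte IV
 * from an initial block counter and a 12-byte nonce.
 * Assumption: the counter is 32 bits wide and little endian.
 */
static void chacha20_build_iv(uint8_t iv[CC20_IV_LEN], uint32_t counter,
			      const uint8_t nonce[CC20_NONCE_LEN])
{
	assert(iv != NULL && nonce != NULL);
	/* counter first, least significant byte first */
	iv[0] = (uint8_t)(counter);
	iv[1] = (uint8_t)(counter >> 8);
	iv[2] = (uint8_t)(counter >> 16);
	iv[3] = (uint8_t)(counter >> 24);
	/* nonce follows the counter */
	memcpy(iv + 4, nonce, CC20_NONCE_LEN);
}
```

Exposing the counter in the IV is what lets a caller such as the AEAD construction start the keystream at an explicit block.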

[PATCH 9/9] xfrm: Define ChaCha20-Poly1305 AEAD XFRM algo for IPsec users

2015-06-01 Thread Martin Willi
Signed-off-by: Martin Willi mar...@strongswan.org --- net/xfrm/xfrm_algo.c | 12 1 file changed, 12 insertions(+) diff --git a/net/xfrm/xfrm_algo.c b/net/xfrm/xfrm_algo.c index 67266b7..42f7c76 100644 --- a/net/xfrm/xfrm_algo.c +++ b/net/xfrm/xfrm_algo.c @@ -159,6 +159,18 @@ static

[PATCH 4/9] crypto: testmgr - Add Poly1305 test vectors from RFC7539

2015-06-01 Thread Martin Willi
Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/testmgr.c | 9 ++ crypto/testmgr.h | 259 +++ 2 files changed, 268 insertions(+) diff --git a/crypto/testmgr.c b/crypto/testmgr.c index abd09c2..faf93a6 100644 --- a/crypto

[PATCH 8/9] crypto: testmgr - Add draft-ietf-ipsecme-chacha20-poly1305 test vector

2015-06-01 Thread Martin Willi
Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/testmgr.c | 15 + crypto/testmgr.h | 179 +++ 2 files changed, 194 insertions(+) diff --git a/crypto/testmgr.c b/crypto/testmgr.c index 915a9ef..ccd19cf 100644 --- a/crypto

[PATCH 3/9] crypto: Add a generic Poly1305 authenticator implementation

2015-06-01 Thread Martin Willi
on public domain code by Daniel J. Bernstein and Andrew Moon. Signed-off-by: Martin Willi mar...@strongswan.org --- crypto/Kconfig| 9 ++ crypto/Makefile | 1 + crypto/poly1305_generic.c | 300 ++ 3 files changed, 310 insertions

[PATCH 0/9] crypto: Add ChaCha20-Poly1305 AEAD support for IPsec

2015-06-01 Thread Martin Willi
the IPsec throughput is ~700Mbits/s with these portable drivers. Architecture specific drivers subject to a future patchset can improve performance, for example with SSE doubling performance is feasible. Martin Willi (9): crypto: Add a generic ChaCha20 stream cipher implementation crypto: testmgr

Re: CCM/GCM implementation defect

2015-04-23 Thread Martin Willi
Hi Herbert, > > Does this mean that even the test vectors (crypto/testmgr.h) are broken? > Indeed. The test vectors appear to be generated either through our implementation or by one that is identical to us. I'm not sure about that. RFC4106 refers to [1] for test vectors, which is still

Re: CCM/GCM implementation defect

2015-04-23 Thread Martin Willi
Hi Steffen, > It looks like our IPsec implementations of CCM and GCM are buggy in that they don't include the IV in the authentication calculation. Seems like crypto_rfc4106_crypt() passes the associated data it got from ESP directly to gcm, without chaining with the IV. Do you have any

Re: [PATCH 2/3] xfrm: Traffic Flow Confidentiality for IPv4 ESP

2010-12-08 Thread Martin Willi
> In particular, why would we need a boundary at all? Setting it to anything other than the PMTU would seem to defeat the purpose of TFC for packets between the boundary and the PMTU. I don't agree, this highly depends on the traffic on the SA. For a general purpose tunnel with TCP flows, PMTU

[PATCH 3/3] xfrm: Traffic Flow Confidentiality for IPv6 ESP

2010-12-08 Thread Martin Willi
Add TFC padding to all packets smaller than the boundary configured on the xfrm state. If the boundary is larger than the PMTU, limit padding to the PMTU. Signed-off-by: Martin Willi mar...@strongswan.org --- net/ipv6/esp6.c | 32 1 files changed, 24 insertions
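The padding rule stated in this commit message can be sketched as a small pure function. This is an illustration only: the function name and parameters are hypothetical, lengths are plain byte counts, and the real ESP output path also has to account for header and trailer overhead, which is left out here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: number of TFC padding
 * bytes to add to a packet of pkt_len bytes.
 *  - Packets below the configured boundary are padded up to it.
 *  - A boundary larger than the PMTU is limited to the PMTU.
 *  - Packets already at or above the target get no padding.
 */
static size_t tfc_pad_len(size_t pkt_len, size_t boundary, size_t pmtu)
{
	size_t target = boundary < pmtu ? boundary : pmtu;

	assert(pmtu > 0);
	return pkt_len < target ? target - pkt_len : 0;
}
```

With a boundary of 1000 and a PMTU of 1500, a 100-byte packet gets 900 bytes of padding; raising the boundary to 2000 caps the target at the PMTU, giving 1400 bytes.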

[PATCH 2/3] xfrm: Traffic Flow Confidentiality for IPv4 ESP

2010-12-08 Thread Martin Willi
Add TFC padding to all packets smaller than the boundary configured on the xfrm state. If the boundary is larger than the PMTU, limit padding to the PMTU. Signed-off-by: Martin Willi mar...@strongswan.org --- net/ipv4/esp4.c | 32 1 files changed, 24 insertions

[PATCH 0/3] xfrm: ESP Traffic Flow Confidentiality padding (v3)

2010-12-08 Thread Martin Willi
. Changes from v2: - Remove unused flag field in attribute, use a plain u32 as attribute payload - Reject installation of TFC padding on non-tunnel SAs Martin Willi (3): xfrm: Add Traffic Flow Confidentiality padding XFRM attribute xfrm: Traffic Flow Confidentiality for IPv4 ESP

[PATCH 1/3] xfrm: Add Traffic Flow Confidentiality padding XFRM attribute

2010-12-08 Thread Martin Willi
The XFRMA_TFCPAD attribute for XFRM state installation configures Traffic Flow Confidentiality by padding ESP packets to a specified length. Signed-off-by: Martin Willi mar...@strongswan.org --- include/linux/xfrm.h |1 + include/net/xfrm.h |1 + net/xfrm/xfrm_user.c | 19

[PATCH 2/3] xfrm: Traffic Flow Confidentiality for IPv4 ESP

2010-12-07 Thread Martin Willi
Add TFC padding to all packets smaller than the boundary configured on the xfrm state. If the boundary is larger than the PMTU, limit padding to the PMTU. Signed-off-by: Martin Willi mar...@strongswan.org --- net/ipv4/esp4.c | 33 + 1 files changed, 25

[PATCH 3/3] xfrm: Traffic Flow Confidentiality for IPv6 ESP

2010-12-07 Thread Martin Willi
Add TFC padding to all packets smaller than the boundary configured on the xfrm state. If the boundary is larger than the PMTU, limit padding to the PMTU. Signed-off-by: Martin Willi mar...@strongswan.org --- net/ipv6/esp6.c | 33 + 1 files changed, 25

[PATCH 0/3] xfrm: ESP Traffic Flow Confidentiality padding (v2)

2010-12-07 Thread Martin Willi
the currently unused flags in the XFRM attribute to implement ESPv2 fallback or other extensions in the future without changing the ABI. Martin Willi (3): xfrm: Add Traffic Flow Confidentiality padding XFRM attribute xfrm: Traffic Flow Confidentiality for IPv4 ESP xfrm: Traffic Flow

[PATCH 1/3] xfrm: Add Traffic Flow Confidentiality padding XFRM attribute

2010-12-07 Thread Martin Willi
The XFRMA_TFC attribute for XFRM state installation configures Traffic Flow Confidentiality by padding ESP packets to a specified length. Signed-off-by: Martin Willi mar...@strongswan.org --- include/linux/xfrm.h |6 ++ include/net/xfrm.h |1 + net/xfrm/xfrm_user.c | 16

Re: [PATCH 3/5] xfrm: Traffic Flow Confidentiality for IPv4 ESP

2010-12-06 Thread Martin Willi
Hi Herbert, > I know why you want to do this, what I'm asking is do you have any research behind this with regards to security? Has this scheme been discussed on a public forum somewhere? No, sorry, I haven't found much valuable discussion about TFC padding. Nothing at all on how to overcome the

Re: [PATCH 3/5] xfrm: Traffic Flow Confidentiality for IPv4 ESP

2010-12-03 Thread Martin Willi
> What is the basis of this random length padding? Let's assume a peer does not support ESPv3 padding, but we have to pad a small packet with more than 255 bytes. We can't, the ESP padding length field is limited to 255. We could add 255 fixed bytes, but an eavesdropper could just subtract the 255

[PATCH 3/5] xfrm: Traffic Flow Confidentiality for IPv4 ESP

2010-11-30 Thread Martin Willi
If configured on xfrm state, increase the length of all packets to a given boundary using TFC padding as specified in RFC4303. For transport mode, or if the XFRM_TFC_ESPV3 is not set, grow the ESP padding field instead. Signed-off-by: Martin Willi mar...@strongswan.org --- net/ipv4/esp4.c | 42

[PATCH 2/5] xfrm: Remove unused ESP padlen field

2010-11-30 Thread Martin Willi
The padlen field in IPv4/6 ESP is used to align the ESP padding length to a value larger than the aead block size. There is however no option to set this field, hence it is removed. Signed-off-by: Martin Willi mar...@strongswan.org --- include/net/esp.h |3 --- net/ipv4/esp4.c | 11

[PATCH 4/5] xfrm: Traffic Flow Confidentiality for IPv6 ESP

2010-11-30 Thread Martin Willi
If configured on xfrm state, increase the length of all packets to a given boundary using TFC padding as specified in RFC4303. For transport mode, or if the XFRM_TFC_ESPV3 is not set, grow the ESP padding field instead. Signed-off-by: Martin Willi mar...@strongswan.org --- net/ipv6/esp6.c | 42

[PATCH 0/5] xfrm: ESP Traffic Flow Confidentiality padding

2010-11-30 Thread Martin Willi
, but I'm not sure if my PMTU lookup works in all cases (nested transforms?). Any pointer would be appreciated. Martin Willi (5): xfrm: Add Traffic Flow Confidentiality padding XFRM attribute xfrm: Remove unused ESP padlen field xfrm: Traffic Flow Confidentiality for IPv4 ESP xfrm

[PATCH 1/5] xfrm: Add Traffic Flow Confidentiality padding XFRM attribute

2010-11-30 Thread Martin Willi
The XFRMA_TFCPAD attribute for XFRM state installation configures Traffic Flow Confidentiality by padding ESP packets to a specified length. To use RFC4303 TFC padding and overcome the 255 byte ESP padding field limit, the XFRM_TFC_ESPV3 flag must be set. Signed-off-by: Martin Willi mar

Re: [PATCH 4/4] crypto: algif_skcipher - User-space interface for skcipher operations

2010-11-15 Thread Martin Willi
This patch adds the af_alg plugin for symmetric key ciphers, corresponding to the ablkcipher kernel operation type. I can confirm that the newest patch fixes the page leak. Tested-by: Martin Willi mar...@strongswan.org

Re: [PATCH 2/4] crypto: af_alg - User-space interface for Crypto API

2010-11-15 Thread Martin Willi
This patch creates the backbone of the user-space interface for the Crypto API, through a new socket family AF_ALG. Tested-by: Martin Willi mar...@strongswan.org

Re: [PATCH 3/4] crypto: algif_hash - User-space interface for hash operations

2010-11-15 Thread Martin Willi
This patch adds the af_alg plugin for hash, corresponding to the ahash kernel operation type. Tested-by: Martin Willi mar...@strongswan.org

Re: [PATCH 4/4] crypto: algif_skcipher - User-space interface for skcipher operations

2010-11-08 Thread Martin Willi
> Hmm, can you show me your test program and how you determined that it was leaking pages? The test program below runs 1000 encryptions: # grep nr_free /proc/vmstat nr_free_pages 11031 # ./test ... # grep nr_free /proc/vmstat nr_free_pages 10026 # ./test ... # grep nr_free /proc/vmstat

Re: [PATCH 4/4] crypto: algif_skcipher - User-space interface for skcipher operations

2010-11-06 Thread Martin Willi
Hi Herbert, I did a proof-of-concept implementation for our crypto library, the interface looks good so far. All our hash, hmac, xcbc and cipher test vectors matched. + sg_assign_page(sg + i, alloc_page(GFP_KERNEL)); Every skcipher operation leaks memory on my box (this

[PATCH] xfrm: Fix truncation length of authentication algorithms installed via PF_KEY

2009-12-09 Thread Martin Willi
-off-by: Martin Willi mar...@strongswan.org --- net/key/af_key.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/net/key/af_key.c b/net/key/af_key.c index 84209fb..76fa6fe 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -1193,6 +1193,7 @@ static struct xfrm_state

[PATCH 3/3] xfrm: Use the user specified truncation length in ESP and AH

2009-11-25 Thread Martin Willi
Instead of using the hardcoded truncation for authentication algorithms, use the truncation length specified on xfrm_state. Signed-off-by: Martin Willi mar...@strongswan.org --- net/ipv4/ah4.c |2 +- net/ipv4/esp4.c |2 +- net/ipv6/ah6.c |2 +- net/ipv6/esp6.c |2 +- 4 files

[PATCH 0/3] xfrm: Custom truncation lengths for authentication algorithms

2009-11-25 Thread Martin Willi
The following patchset adds support for defining truncation lengths for authentication algorithms in userspace. The main purpose for this is to support SHA256 in IPsec using the standardized 128 bit instead of the currently used 96 bit truncation. Martin Willi (3): xfrm: Define new XFRM netlink

[PATCH 2/3] xfrm: Store aalg in xfrm_state with a user specified truncation length

2009-11-25 Thread Martin Willi
is specified, or the authentication algorithm is specified using xfrm_algo, the truncation length from the algorithm description in the kernel is used. Signed-off-by: Martin Willi mar...@strongswan.org --- include/net/xfrm.h| 12 - net/xfrm/xfrm_state.c |2 +- net/xfrm/xfrm_user.c | 129

[PATCH 1/3] xfrm: Define new XFRM netlink auth attribute with specified truncation bits

2009-11-25 Thread Martin Willi
The new XFRMA_ALG_AUTH_TRUNC attribute taking a xfrm_algo_auth as argument allows the installation of authentication algorithms with a truncation length specified in userspace, i.e. SHA256 with 128 bit instead of 96 bit truncation. Signed-off-by: Martin Willi mar...@strongswan.org --- include

[PATCH] xfrm: Add SHA384 and SHA512 HMAC authentication algorithms to XFRM

2009-11-25 Thread Martin Willi
These algorithms use a truncation of 192/256 bits, as specified in RFC4868. Signed-off-by: Martin Willi mar...@strongswan.org --- net/xfrm/xfrm_algo.c | 34 ++ 1 files changed, 34 insertions(+), 0 deletions(-) diff --git a/net/xfrm/xfrm_algo.c b/net/xfrm

Re: HMAC regression

2009-05-31 Thread Martin Willi
> You must be getting an sg entry that crosses a page boundary, rather than two sg entries that both stay within a page. Yes. > These things are very rare, and usually occur as a result of SLAB debugging causing kmalloc to return memory that crosses page boundaries. Indeed, SLAB_DEBUG was

HMAC regression

2009-05-28 Thread Martin Willi
Hi, Switching the hash implementations to the new shash API introduced a regression. HMACs are created incorrectly if the data is scattered over multiple pages, resulting in very unreliable IPsec tunnels. The appended patch adds a silly hmac(sha1) test vector larger than a 4KB page and fails on