From: Antoine Tenart
> Sent: 02 May 2018 10:57
> Adds the AEAD_REQUEST_ON_STACK primitive to allow allocating AEAD
> requests on the stack, as it can already be done with various other
> crypto algorithms within the kernel.
>
> Signed-off-by: Antoine Tenart
> ---
>
From: Salvatore Mesoraca
> Sent: 09 April 2018 17:38
...
> > You can also do much better than allocating MAX_BLOCKSIZE + MAX_ALIGNMASK
> > bytes by requesting 'long' aligned on-stack memory.
> > The easiest way is to define a union like:
> >
> > union crypto_tmp {
> > u8
From: Salvatore Mesoraca
> Sent: 09 April 2018 14:55
>
> v2:
> As suggested by Herbert Xu, the blocksize and alignmask checks
> have been moved to crypto_check_alg.
> So, now, all the other separate checks are not necessary.
> Also, the defines have been moved to
From: Eric Biggers
> Sent: 14 March 2018 18:32
...
> Also, I recall there being a long discussion a while back about how
> __aligned(16) doesn't work on local variables because the kernel's stack
> pointer
> isn't guaranteed to maintain the alignment assumed by the compiler (see commit
>
From: Juergen Gross
> Sent: 19 December 2017 08:05
..
>
> Exchanging 2 registers can be done without memory access via:
>
> xor reg1, reg2
> xor reg2, reg1
> xor reg1, reg2
That'll generate horrid data dependencies.
ISTR that there are some optimisations for the stack,
so even 'push reg1', 'mov
From: Logan Gunthorpe
> Sent: 13 April 2017 23:05
> Straightforward conversion to the new helper, except due to
> the lack of error path, we have to warn if unmapable memory
> is ever present in the sgl.
>
> Signed-off-by: Logan Gunthorpe
> ---
>
From: Daniel Axtens
> Sent: 15 March 2017 22:30
> Hi David,
>
> > While not part of this change, the unrolled loops look as though
> > they just destroy the cpu cache.
> > I'd like be convinced that anything does CRC over long enough buffers
> > to make it a gain at all.
> >
> > With modern (not
From: Linuxppc-dev Daniel Axtens
> Sent: 15 March 2017 12:38
> The core nuts and bolts of the crc32c vpmsum algorithm will
> also work for a number of other CRC algorithms with different
> polynomials. Factor out the function into a new asm file.
>
> To handle multiple users of the function, a
From: Andy Lutomirski
> Sent: 10 January 2017 23:25
> There are some hashes (e.g. sha224) that have some internal trickery
> to make sure that only the correct number of output bytes are
> generated. If something goes wrong, they could potentially overrun
> the output buffer.
>
> Make the test
From: George Spelvin
> Sent: 17 December 2016 15:21
...
> uint32_t
> hsiphash24(char const *in, size_t len, uint32_t const key[2])
> {
> uint32_t c = key[0];
> uint32_t d = key[1];
> uint32_t a = 0x6c796765 ^ 0x736f6d65;
> uint32_t b = d ^ 0x74656462 ^ 0x646f7261;
I've
From: George Spelvin
> Sent: 15 December 2016 23:29
> > If a halved version of SipHash can bring significant performance boost
> > (with 32b words instead of 64b words) with an acceptable security level
> > (64-bit enough?) then we may design such a version.
>
> I was thinking if the key could be
From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> These restore parity with the jhash interface by providing high
> performance helpers for common input sizes.
...
> +#define PREAMBLE(len) \
> + u64 v0 = 0x736f6d6570736575ULL; \
> + u64 v1 = 0x646f72616e646f6dULL; \
> + u64 v2
From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> This gives a clear speed and security improvement. Siphash is both
> faster and is more solid crypto than the aging MD5.
>
> Rather than manually filling MD5 buffers, for IPv6, we simply create
> a layout by a simple anonymous struct, for
From: Hannes Frederic Sowa
> Sent: 15 December 2016 14:57
> On 15.12.2016 14:56, David Laight wrote:
> > From: Hannes Frederic Sowa
> >> Sent: 15 December 2016 12:50
> >> On 15.12.2016 13:28, David Laight wrote:
> >>> From: Hannes Frederi
From: Hannes Frederic Sowa
> Sent: 15 December 2016 12:50
> On 15.12.2016 13:28, David Laight wrote:
> > From: Hannes Frederic Sowa
> >> Sent: 15 December 2016 12:23
> > ...
> >> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
&
From: Hannes Frederic Sowa
> Sent: 15 December 2016 12:23
...
> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
> bytes on 32 bit. Do you question that?
Yes.
The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc).
David
From: Hannes Frederic Sowa
> Sent: 14 December 2016 22:03
> On 14.12.2016 13:46, Jason A. Donenfeld wrote:
> > Hi David,
> >
> > On Wed, Dec 14, 2016 at 10:56 AM, David Laight <david.lai...@aculab.com>
> > wrote:
> >> ...
> >>> +u64
From: Linus Torvalds
> Sent: 15 December 2016 00:11
> On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld wrote:
> >
> > Or does your reasonable dislike of "word" still allow for the use of
> > dword and qword, so that the current function names of:
>
> dword really is confusing
From: Behalf Of Jason A. Donenfeld
> Sent: 14 December 2016 18:46
...
> + ret = *chaining = siphash24((u8 *), offsetof(typeof(combined),
> end),
If you make the first argument 'const void *' you won't need the cast
on every call.
I'd also suggest making the key u64[2].
David
--
To
From: Andy Lutomirski
> Sent: 12 December 2016 20:53
> The driver put a constant buffer of all zeros on the stack and
> pointed a scatterlist entry at it in two places. This doesn't work
> with virtual stacks. Use a static 16-byte buffer of zeros instead.
...
I didn't think you could dma from
From: Paulo Flabiano Smorigo
> Sent: 19 July 2016 14:36
> Ignore assembly files generated by the perl script.
...
> diff --git a/drivers/crypto/vmx/.gitignore b/drivers/crypto/vmx/.gitignore
> new file mode 100644
> index 000..af4a7ce
> --- /dev/null
> +++ b/drivers/crypto/vmx/.gitignore
> @@
From: Paulo Flabiano Smorigo
> Sent: 11 July 2016 20:08
>
> This patch add XTS subroutines using VMX-crypto driver.
>
> It gives a boost of 20 times using XTS.
>
> These code has been adopted from OpenSSL project in collaboration
> with the original author (Andy Polyakov ).
From: Sowmini Varadhan
> Sent: 01 December 2015 18:37
...
> I was using esp-null merely to not have the crypto itself perturb
> the numbers (i.e., just focus on the s/w overhead for now), but here
> are the numbers for the stock linux kernel stack
> Gbps peak cpu util
> esp-null
From: Sowmini Varadhan
> Sent: 02 December 2015 12:12
> On (12/02/15 11:56), David Laight wrote:
> > > Gbps peak cpu util
> > > esp-null 1.8 71%
> > > aes-gcm-c-2561.6 79%
> > > aes-ccm-a-1280.7 96%
> > >
&g
From: Christophe Leroy
Linux CodyingStyle recommends to use short variables for local
variables. ptr is just good enough for those 3 lines functions.
It helps keep single lines shorter than 80 characters.
...
-static void to_talitos_ptr(struct talitos_ptr *talitos_ptr, dma_addr_t
dma_addr)
From: Markus Stockhausen
[PATCH v1 1/3] SHA1 for PPC/SPE - assembler
This is the assembler code for SHA1 implementation with
the SIMD SPE instruction set. With the enhanced instruction
set we can operate on 2 32 bit words in parallel. That helps
reducing the time to calculate W16-W79. For
From: Markus Stockhausen
4K AES tables for big endian
I can't help feeling that you could give more information about how the
values are generated.
...
+ * These big endian AES encryption/decryption tables are designed to be
simply
+ * accessed by a combination of rlwimi/lwz instructions
Doesn't this badly overflow W[] ..
+#define SHA512_0_15(i, a, b, c, d, e, f, g, h) \
+ t1 = h + e1(e) + Ch(e, f, g) + sha512_K[i] + W[i]; \
...
+ for (i = 0; i 16; i += 8) {
...
+ SHA512_0_15(i + 7, b, c, d, e, f, g, h, a);
+ }
David
--
To
Trying a dynamic memory allocation, and fallback on a single
pre-allocated bloc of memory, shared by all cpus, protected by a
spinlock
...
-
+ static u64 msg_schedule[80];
+ static DEFINE_SPINLOCK(msg_schedule_lock);
int i;
- u64 *W = get_cpu_var(msg_schedule);
+
29 matches
Mail list logo