Re: [PATCH v5 1/2] crypto: skcipher AF_ALG - overhaul memory management

2017-03-18 Thread Stephan Müller
Am Donnerstag, 16. März 2017, 10:52:48 CET schrieb Herbert Xu:

Hi Herbert,
> 
> First of all you're only limiting the amount of memory occupied
> by the SG list which is not the same thing as the memory pinned
> down by the actual recvmsg.

When considering af_alg_make_sg, the function iov_iter_get_pages is used to 
obtain the user space pages for recvmsg. When looking closely at that 
function, I would understand that the user space pages are pinned into memory 
and made accessible from kernel space. That said, I would infer from the code 
that no kernel-local memory is allocated for the real data. Similar to zero-
copy, pages are shared between kernel and user that will hold the crypto 
operation result (ciphertext for enc or plaintext for dec).

Therefore, I would infer that the memory usage of the discussed code is 
limited to the skcipher_rsgl meta data which is limited by sock_kmalloc. This 
finally would imply that the kernel will not occupy more memory than allocated 
by sock_kmalloc.

Or am I missing something?

Thanks
Stephan


Re: [ANNOUNCE] /dev/random - a new approach (code for 4.11-rc1)

2017-03-18 Thread Stephan Müller
Am Samstag, 18. März 2017, 14:43:18 CET schrieb Jeffrey Walton:

Hi Jeffrey,

> > I am not sure how this statement relates to the quote above. RDSEED is the
> > CBC-MACed output of the flip-flop providing the raw noise.
> > 
> > RDRAND is the output of the SP800-90A CTR DRBG that is seeded by the
> > CBC-MAC that also feeds RDSEED. Thus, RDSEED is as fast as the noise
> > source where RDRAND is a pure deterministic RNG that tries to be
> > (re)seeded as often as possible.
> > 
> > Both instructions are totally unrelated to the SP800-90A DRBG available to
> > the Linux kernel.
> 
> SP800-90A requires an entropy source to bootstrap the Hash, HMAC and
> CTR generators. That is, the Instantiate and Reseed functions need an
> approved source of entropy. Both RDRAND and RDSEED are approved for
> Intel chips. See SP800-90A, Section 8.6.5
> (http://csrc.nist.gov/publications/nistpubs/800-90A/SP800-90A.pdf).

I am aware that SP800-90A makes the claim of having an approved noise source. 
But as of now, there is no such thing.

NIST is aware of that issue. To cover that issue during a FIPS 140-2 
validation, you have to prove your noise sources to be compliant to SP800-90B. 
I performed such noise source assessments as part of the FIPS 140-2 
validations of the Intel Sunrise Point or the Qualcomm HW DRBG FIPS 140-2 
validations. Also, I completed such assessments for the FIPS 140-2 validations 
of the legady /dev/random covering numerous Linux-based cryptographic modules 
over the last couple of years.

To get a glimpse of how such assessments for FIPS 140-2 are conducted, please 
have a look at the assessment [1] section 5.3.2.1 starting on page 72 in the 
lower half (note that I was the main author of this study). To be honest, the 
assessment in [1] section 5.5 was the main motivation for my LRNG 
implementation.

That said, [2] section 3.4.1, starting at page 34 bottom, you see the same 
SP800-90B test approach that was equally accepted by NIST during formal FIPS 
140-2 validations of other noise sources. Hence, I would conclude that my 
suggested implementation of the noise source is appropriate for a DRBG to be 
compliant to section 8.6.5 of SP800-90A.

But you mention a very important topic: is it and how is it ensured that the 
DRBG is appropriately seeded. This issue is discussed in [2] section 2.1 which 
explains the initial, minimal and full seeded stages of the DRBG.

[1] https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Publikationen/Studien/
ZufallinVMS/Randomness-in-VMs.pdf

[2] http://www.chronox.de/lrng/doc/lrng.pdf

Ciao
Stephan


Re: [ANNOUNCE] /dev/random - a new approach (code for 4.11-rc1)

2017-03-18 Thread Jeffrey Walton
>> > The design and implementation is driven by a set of goals described in [2]
>> > that the LRNG completely implements. Furthermore, [2] includes a
>> > comparison with RNG design suggestions such as SP800-90B, SP800-90C, and
>> > AIS20/31.
>>
>> A quick comment about SP800 and the hardware instructions... RDSEED is
>> 2 to 5 times slower than RDRAND on Intel hardware, depending on the
>> architecture and microarchitecture.
>
> I am not sure how this statement relates to the quote above. RDSEED is the
> CBC-MACed output of the flip-flop providing the raw noise.
>
> RDRAND is the output of the SP800-90A CTR DRBG that is seeded by the CBC-MAC
> that also feeds RDSEED. Thus, RDSEED is as fast as the noise source where
> RDRAND is a pure deterministic RNG that tries to be (re)seeded as often as
> possible.
>
> Both instructions are totally unrelated to the SP800-90A DRBG available to the
> Linux kernel.

SP800-90A requires an entropy source to bootstrap the Hash, HMAC and
CTR generators. That is, the Instantiate and Reseed functions need an
approved source of entropy. Both RDRAND and RDSEED are approved for
Intel chips. See SP800-90A, Section 8.6.5
(http://csrc.nist.gov/publications/nistpubs/800-90A/SP800-90A.pdf).

Jeff


Re: [ANNOUNCE] /dev/random - a new approach (code for 4.11-rc1)

2017-03-18 Thread Stephan Müller
Am Samstag, 18. März 2017, 11:11:57 CET schrieb Jeffrey Walton:

Hi Jeffrey,

> > The design and implementation is driven by a set of goals described in [2]
> > that the LRNG completely implements. Furthermore, [2] includes a
> > comparison with RNG design suggestions such as SP800-90B, SP800-90C, and
> > AIS20/31.
> 
> A quick comment about SP800 and the hardware instructions... RDSEED is
> 2 to 5 times slower than RDRAND on Intel hardware, depending on the
> architecture and microarchitecture.

I am not sure how this statement relates to the quote above. RDSEED is the 
CBC-MACed output of the flip-flop providing the raw noise.

RDRAND is the output of the SP800-90A CTR DRBG that is seeded by the CBC-MAC 
that also feeds RDSEED. Thus, RDSEED is as fast as the noise source where 
RDRAND is a pure deterministic RNG that tries to be (re)seeded as often as 
possible.

Both instructions are totally unrelated to the SP800-90A DRBG available to the 
Linux kernel.

> AMD's implementation of RDRAND is
> orders of magnitude slower than Intel's. Testing on an Athlon 845 X4
> (Bulldozer v4) @ 3.5 GHz shows it runs between 4100 and 4500 cycles
> per byte. It works out to be about 1 MiB/s.

Please consider my speed measurements given in [1] section 3.4.6. The DRBG is 
just slightly slower than the ChaCha20 on small block sizes and twice as fast 
on larger block sizes using AES-NI on x86. As the DRBG implementation has no 
relationship to the RDRAND DRBG, I am not sure about your argument.

When I refer to hardware instructions and SP800-90A, I consider the SP800-90A 
DRBG implementation in crypto/drbg.c provided with the kernel crypto API which 
uses the raw AES/SHA cipher implementation provided by the kernel crypto API. 
Here, the implementation uses the fastest one, such as the AES-NI raw AES 
implementation on x86. Or it uses the ARM NEON SHA implementation for the 
HMAC/Hash DRBG.
> 
> While the LRNG may reach a cryptographically acceptable seed level
> much earlier then the existing /dev/random, it may not be early
> enough.

The LRNG will initialize as a DRBG during late_initcall. Thus, the DRBG is 
always present if user space calls.

However, during kernel boot, there is of course a need for earlier randomness. 
This is covered by the init DRNG documented in [1] section 2.10.

> Some components, like systemd, will ask for random numbers and
> truck-on even if they are not available. Systemd does not block or
> wait if get_random_bytes fails to produce. In the bigger picture,
> don't expect that software layered above will do the expected thing in
> all cases.

The LRNG works as a full ABI and API replacement for the current /dev/random 
implementation. I run it on my servers. It delivers random data for all use 
cases, during early kernel and user space boot as well as during runtime.

[1] http://www.chronox.de/lrng/doc/lrng.pdf
> 
> Jeff


Ciao
Stephan


Re: [PATCH] crypto: zip - Memory corruption in zip_clear_stats()

2017-03-18 Thread Dan Carpenter
On Sat, Mar 18, 2017 at 11:24:34AM +0100, walter harms wrote:
> 
> 
> Am 17.03.2017 21:46, schrieb Dan Carpenter:
> > There is a typo here.  It should be "stats" instead of "state".  The
> > impact is that we clear 224 bytes instead of 80 and we zero out memory
> > that we shouldn't.
> > 
> > Fixes: 09ae5d37e093 ("crypto: zip - Add Compression/Decompression 
> > statistics")
> > Signed-off-by: Dan Carpenter 
> > 
> > diff --git a/drivers/crypto/cavium/zip/zip_main.c 
> > b/drivers/crypto/cavium/zip/zip_main.c
> > index 0951e20b395b..6ff13d80d82e 100644
> > --- a/drivers/crypto/cavium/zip/zip_main.c
> > +++ b/drivers/crypto/cavium/zip/zip_main.c
> > @@ -530,7 +530,7 @@ static int zip_clear_stats(struct seq_file *s, void 
> > *unused)
> > for (index = 0; index < MAX_ZIP_DEVICES; index++) {
> > if (zip_dev[index]) {
> > memset(&zip_dev[index]->stats, 0,
> > -  sizeof(struct zip_state));
> > +  sizeof(struct zip_stats));
> 
> 
> as future FIXME some show find a name that differ in more than just the last 
> char.
> NTL maybe
>  sizeof(zip_dev[index]->stats)
> can be used here ?

That's sort of unweildy.  I don't fear that change because I'm confident
I would catch it with static analysis.

regards,
dan carpenter



Re: [PATCH] crypto: zip - Memory corruption in zip_clear_stats()

2017-03-18 Thread walter harms


Am 17.03.2017 21:46, schrieb Dan Carpenter:
> There is a typo here.  It should be "stats" instead of "state".  The
> impact is that we clear 224 bytes instead of 80 and we zero out memory
> that we shouldn't.
> 
> Fixes: 09ae5d37e093 ("crypto: zip - Add Compression/Decompression statistics")
> Signed-off-by: Dan Carpenter 
> 
> diff --git a/drivers/crypto/cavium/zip/zip_main.c 
> b/drivers/crypto/cavium/zip/zip_main.c
> index 0951e20b395b..6ff13d80d82e 100644
> --- a/drivers/crypto/cavium/zip/zip_main.c
> +++ b/drivers/crypto/cavium/zip/zip_main.c
> @@ -530,7 +530,7 @@ static int zip_clear_stats(struct seq_file *s, void 
> *unused)
>   for (index = 0; index < MAX_ZIP_DEVICES; index++) {
>   if (zip_dev[index]) {
>   memset(&zip_dev[index]->stats, 0,
> -sizeof(struct zip_state));
> +sizeof(struct zip_stats));


as future FIXME some show find a name that differ in more than just the last 
char.
NTL maybe
 sizeof(zip_dev[index]->stats)
can be used here ?

re,
 wh

>   seq_printf(s, "Cleared stats for zip %d\n", index);
>   }
>   }
> --
> To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Re: [ANNOUNCE] /dev/random - a new approach (code for 4.11-rc1)

2017-03-18 Thread Jeffrey Walton
> The design and implementation is driven by a set of goals described in [2]
> that the LRNG completely implements. Furthermore, [2] includes a
> comparison with RNG design suggestions such as SP800-90B, SP800-90C, and
> AIS20/31.

A quick comment about SP800 and the hardware instructions... RDSEED is
2 to 5 times slower than RDRAND on Intel hardware, depending on the
architecture and microarchitecture. AMD's implementation of RDRAND is
orders of magnitude slower than Intel's. Testing on an Athlon 845 X4
(Bulldozer v4) @ 3.5 GHz shows it runs between 4100 and 4500 cycles
per byte. It works out to be about 1 MiB/s.

While the LRNG may reach a cryptographically acceptable seed level
much earlier then the existing /dev/random, it may not be early
enough. Some components, like systemd, will ask for random numbers and
truck-on even if they are not available. Systemd does not block or
wait if get_random_bytes fails to produce. In the bigger picture,
don't expect that software layered above will do the expected thing in
all cases.

Jeff


Question - seeding the hw pseudo random number generator

2017-03-18 Thread Krzysztof Kozlowski
Hi,

I looked at Exynos Pseudo Random Nubmer Generator driver
(drivers/char/hw_random/exynos-rng.c) and noticed that it always seeds
the device with jiffies.  Then I looked at few other drivers and found
that they do not seed themself (or at least I couldn't find this).

I think the hw_random API does not provide generic infrastructure for
seeding.

What is the preferred approach for seeding a PRNG device? Use jiffies or
a fixed value?

Or maybe the interface should be abandoned in favor of crypto API?

Best regards,
Krzysztof


Re: [ANNOUNCE] /dev/random - a new approach (code for 4.11-rc1)

2017-03-18 Thread Stephan Müller
Am Freitag, 17. März 2017, 16:31:29 CET schrieb Jason A. Donenfeld:

Hi Jason,

> Hey Stephan,
> 
> Have you considered submitting this without so many options? For
> example -- just unconditionally using ChaCha20 instead of the
> configurable crypto API functions? And either removing the FIPS140
> compliance code, and either unconditionally including it, or just
> getting rid of it? And finally just making this a part of the kernel
> directly, instead of adding this as a standalone optional component?

Submitting that with various options removed is no problem as the core concept 
of my implementation is to be flexible to allow folks to easily add new noise 
sources or a DRNG replacement.

Yet, there are reasons for the different options:

- some folks want to use a known good and proven DRNG for post-processing 
(hence the offer for using an SP800-90A DRBG)

- some folks want a DRNG that is not tied to the kernel crypto API (hence the 
offer for the ChaCha20 DRNG)

- some folks need the FIPS continuous self test and some do not.

The idea for the solution in the LRNG code is that a user shall not be 
involved with these compile-time decisions. All options that are present are 
based on other kernel configuration options:

- if the kernel crypto API is present and the SP800-90A DRBG is compiled, then 
the LRNG uses the SP800-90A DRBG

- as a user may compile in different types of the SP800-90A DRBG, the LRNG 
will use the one that is available

- if no kernel crypto API is compiled, it uses the ChaCha20 DRNG

- if FIPS support is not compiled, the FIPS continuous test is not compiled

Thus, a user does not get in touch with all the options in the LRNG code. 
Shouldn't that be a good argument to keep these options?

I would like to add the LRNG code directly to the kernel and I can offer an 
official patch instead of such out-of-tree code. However, in the past I got 
shot down when I suggested an inclusion.

Ciao
Stephan