On Tue, Apr 30, 2019 at 10:51 AM Reshetova, Elena
<[email protected]> wrote:
> base: Simple syscall: 0.1761 microseconds
> get_random_bytes (4096 bytes per-cpu buffer): 0.1793 microseconds
> get_random_bytes (64 bytes per-cpu buffer): 0.1866 microseconds

The 4096 size seems pretty good.

> Below is a snippet of what I quickly did (relevant parts) to get these numbers.
> I do the initial population of the per-cpu buffers in a late_initcall, but
> practice shows that the rng might not always be in a good state by then.
> So we might not have really good randomness at that point, but I am not sure
> this is a practical problem, since it only applies to system boot, and by the
> time the system has booted, it has already issued enough syscalls that the
> buffer gets refilled with really good numbers.
> Alternatively, we could also do it on the first syscall that each cpu gets,
> but I am not sure that is always guaranteed to have good randomness.

Populating at the first syscall seems like a reasonable way to delay it. And
I agree: I think we should not be too concerned about early RNG state; we
should design for the "after boot" behaviors.
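
For reference, here is a rough sketch of where random_get_byte() would sit
in a syscall path once the system is up. This is not taken from your patch;
the mask and the do_syscall_with_offset() wrapper are purely hypothetical,
just to show the consumer side:

static noinline void do_syscall_with_offset(void)
{
        /* keep only a few bits so the extra stack usage stays small */
        unsigned long offset = random_get_byte() & 0x3f;
        char *sp = __builtin_alloca(offset);

        /* keep the compiler from optimizing the allocation away */
        asm volatile("" : : "r" (sp));

        /* ... dispatch the real syscall work here ... */
}

Whether alloca (or something else) is the right way to actually apply the
offset is a separate question; the point is just that each syscall consumes
one byte from the per-cpu buffer.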

> diff --git a/lib/percpu-random.c b/lib/percpu-random.c
> new file mode 100644
> index 000000000000..3f92c44fbc1a
> --- /dev/null
> +++ b/lib/percpu-random.c
> @@ -0,0 +1,49 @@
> +#include <linux/types.h>
> +#include <linux/percpu.h>
> +#include <linux/random.h>
> +
> +static DEFINE_PER_CPU(struct rnd_buffer, stack_rand_offset) __latent_entropy;
> +
> +
> +/*
> + *    Generate some initially weak seeding values to allow
> + *    to start the prandom_u32() engine.
> + */
> +static int __init stack_rand_offset_init(void)
> +{
> +    int i;
> +
> +    /* extract bits into our per-cpu rand buffers */
> +    for_each_possible_cpu(i) {
> +        struct rnd_buffer *buffer = &per_cpu(stack_rand_offset, i);
> +        buffer->byte_counter = 0;
> +        /* if the rng is not initialized yet, this won't give us good
> +         * randomness, but we cannot wait for the rng to initialize either
> +         */
> +        get_random_bytes(&(buffer->buffer), sizeof(buffer->buffer));

Instead of doing get_random_bytes() here, just set byte_counter =
RANDOM_BUFFER_SIZE and let random_get_byte() do the work on a per-cpu
basis? (See the sketch after the quoted initcall below.)

> +    }
> +
> +    return 0;
> +}
> +late_initcall(stack_rand_offset_init);
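
With that, the initcall could shrink to something like this (a sketch
reusing the names from your patch; if I read the latent_entropy plugin
right, the struct won't start out zeroed, so the counter still needs an
explicit write somewhere):

static int __init stack_rand_offset_init(void)
{
        int i;

        for_each_possible_cpu(i) {
                struct rnd_buffer *buffer = &per_cpu(stack_rand_offset, i);

                /* force a lazy refill on first use on this CPU */
                buffer->byte_counter = RANDOM_BUFFER_SIZE;
        }

        return 0;
}
late_initcall(stack_rand_offset_init);

That way each CPU pulls from get_random_bytes() only when it first needs a
byte, which is about as late as we can reasonably defer it.
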
> +
> +unsigned char random_get_byte(void)
> +{
> +    struct rnd_buffer *buffer = &get_cpu_var(stack_rand_offset);
> +    unsigned char res;
> +
> +    if (buffer->byte_counter >= RANDOM_BUFFER_SIZE) {
> +        get_random_bytes(&(buffer->buffer), sizeof(buffer->buffer));
> +        buffer->byte_counter = 0;
> +    }
> +
> +    res = buffer->buffer[buffer->byte_counter];
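> +    /* wipe the consumed byte so it cannot leak from the per-cpu area later */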
> +    buffer->buffer[buffer->byte_counter] = 0;
> +    buffer->byte_counter++;
> +    put_cpu_var(stack_rand_offset);
> +    return res;
> +}
> +EXPORT_SYMBOL(random_get_byte);

Otherwise, sure, looks good. I remain worried about info leaks of the
percpu area causing pain down the road, but if we find a safer way to do
this, we can do it later.

-- 
Kees Cook
