Fabio Scotoni <[email protected]> wrote:
> As far as I can tell, all of the calls to memset(3) in
> lib/libc/crypt/arc4random.c are intended to wipe memory to avoid having
> the randomly generated data in memory twice, so it would seem good
> practice to use explicit_bzero(3) to avoid this being optimized out.
please clarify under what rules would those memset's be optimized out
> Index: lib/libc/crypt/arc4random.c
> ===================================================================
> RCS file: /cvs/src/lib/libc/crypt/arc4random.c,v
> retrieving revision 1.55
> diff -u -p -r1.55 arc4random.c
> --- lib/libc/crypt/arc4random.c 24 Mar 2019 17:56:54 -0000 1.55
> +++ lib/libc/crypt/arc4random.c 19 Dec 2019 12:51:23 -0000
> @@ -98,7 +98,7 @@ _rs_stir(void)
>
> /* invalidate rs_buf */
> rs->rs_have = 0;
> - memset(rsx->rs_buf, 0, sizeof(rsx->rs_buf));
> + explicit_bzero(rsx->rs_buf, sizeof(rsx->rs_buf));
rsx is global scope, non-static, the compiler has no justification for
eliding a bzero since other external code has visibility on the buffer
> rs->rs_count = 1600000;
> }
> @@ -119,7 +119,7 @@ static inline void
> _rs_rekey(u_char *dat, size_t datlen)
> {
> #ifndef KEYSTREAM_ONLY
> - memset(rsx->rs_buf, 0, sizeof(rsx->rs_buf));
> + explicit_bzero(rsx->rs_buf, sizeof(rsx->rs_buf));
same here
> #endif
> /* fill rs_buf with the keystream */
> chacha_encrypt_bytes(&rsx->rs_chacha, rsx->rs_buf,
> @@ -134,7 +134,7 @@ _rs_rekey(u_char *dat, size_t datlen)
> }
> /* immediately reinit for backtracking resistance */
> _rs_init(rsx->rs_buf, KEYSZ + IVSZ);
> - memset(rsx->rs_buf, 0, KEYSZ + IVSZ);
> + explicit_bzero(rsx->rs_buf, KEYSZ + IVSZ);
same here
> rs->rs_have = sizeof(rsx->rs_buf) - KEYSZ - IVSZ;
> }
>
> @@ -152,7 +152,7 @@ _rs_random_buf(void *_buf, size_t n)
> keystream = rsx->rs_buf + sizeof(rsx->rs_buf)
> - rs->rs_have;
> memcpy(buf, keystream, m);
> - memset(keystream, 0, m);
> + explicit_bzero(keystream, m);
keystream is a piece of rsx, same argument as above
> buf += m;
> n -= m;
> rs->rs_have -= m;
> @@ -172,7 +172,7 @@ _rs_random_u32(uint32_t *val)
> _rs_rekey(NULL, 0);
> keystream = rsx->rs_buf + sizeof(rsx->rs_buf) - rs->rs_have;
> memcpy(val, keystream, sizeof(*val));
> - memset(keystream, 0, sizeof(*val));
> + explicit_bzero(keystream, sizeof(*val));
and same here
> rs->rs_have -= sizeof(*val);
> }
>
there is no voodoo involved in the "compiler deletes memset" optimization.
it only happens when it can prove there are no other viewers. this is a
clearcut case that there are other viewers. therefore we don't need the
function-call non-optimized overhead of explicit_bzero.