Re: Bug#760857: mksh should not export $RANDOM

Lorenzo Sat, 13 Sep 2014 09:48:25 -0700

On 09/12/2014 10:02 PM, Thorsten Glaser wrote:

tl;dr: We probably should simplify the code (no promises
about $RANDOM other than its value area) and not export
$RANDOM any more, and only use arc4random-related functions
where really convenient, “lesser” OSes are SOL. We move
the task to get better random numbers on the script writers
(and provide a sample implementation already), except on
MirBSD, where it is convenient ☺ (but also probably only
useful to replace dice rolls to decide on what to have
for lunch that day).



Lorenzo dixit:

I was wondering if there would be any trouble replacing the lcg with a
generator whose state is __made__ of n>1 words - something like, off the top of
my head, xor128, JKISS, you name it.
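
To make the idea of a multi-word state concrete, here is a minimal C sketch of Marsaglia's xor128 (a xorshift generator with four 32-bit words of state). It is only an illustration, not code from mksh: the struct and function names are made up for this example, the seed words are the defaults from Marsaglia's paper, and the final helper that narrows the output to 15 bits is an assumption about how it could back $RANDOM.

```c
#include <stdint.h>

/* Hypothetical names. The state is four 32-bit words, never all zero. */
struct xor128_state {
	uint32_t x, y, z, w;
};

/* Load the default seed words from Marsaglia's xorshift paper. */
void
xor128_init(struct xor128_state *s)
{
	s->x = 123456789U;
	s->y = 362436069U;
	s->z = 521288629U;
	s->w = 88675123U;
}

/* One step: shift the words along and mix the oldest into the newest. */
uint32_t
xor128_next(struct xor128_state *s)
{
	uint32_t t = s->x ^ (s->x << 11);

	s->x = s->y;
	s->y = s->z;
	s->z = s->w;
	s->w = s->w ^ (s->w >> 19) ^ t ^ (t >> 8);
	return s->w;
}

/* Assumption: keep the top 15 bits to get a value in 0..32767. */
unsigned int
xor128_random15(struct xor128_state *s)
{
	return (unsigned int)(xor128_next(s) >> 17);
}
```

Like the LCG, this is fast and deterministic for a given seed, and it makes no cryptographic claims.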


Something I really want is a sponge construct, like Keccak, but
one where you can constantly write to and read from.

I really wish I could tell you something meaningful, but all I know is that it won SHA-3... IIRC from comments on Keccak, I can't imagine how you think chacha20 is complex and sponge functions aren't ;p but maybe those comments were wrong, I *really* don't know (and didn't look into sponge functions because of them).


Since this is out of the scope of mksh, I’m somewhat tempted to
remove the export and bring back the state we once had:

• if arc4random() exists on the system, always use that
• if arc4random() does not exist, just use an LCG with few
   extra seeding (stack address, etc.) or none at all (since
   the OSes without it likely also don’t have ASLR and so)
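
The two-way choice in the list above could be sketched in C roughly as follows. This is not the mksh source: HAVE_ARC4RANDOM is a hypothetical build-time macro, the function names are invented for the example, and the LCG constants are the ones from the sample rand() in the C standard, picked here only because they are well known.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * HAVE_ARC4RANDOM is a hypothetical macro that a configure step would
 * define when libc provides arc4random(); it is an assumption here.
 */
static uint32_t lcg_state = 1;

/* An assignment to RANDOM re-seeds the fallback generator. */
void
rnd_seed(uint32_t seed)
{
	lcg_state = seed;
}

/* Returns a value in 0..32767, the only promise made about $RANDOM. */
unsigned int
rnd_get(void)
{
#ifdef HAVE_ARC4RANDOM
	return (unsigned int)(arc4random() & 0x7FFF);
#else
	/* Textbook LCG; same seed, same output sequence. */
	lcg_state = lcg_state * 1103515245U + 12345U;
	return (unsigned int)((lcg_state >> 16) & 0x7FFF);
#endif
}
```

In the arc4random() branch the seed is simply ignored, which corresponds to the "no pushback" case discussed further down.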

This has several downsides:

• we used to have set ±o arc4random
   ‣ one variant: constant, to see which one is in use
     ⇒ led to the LCG codepath being untested/buggy
   ‣ other variant: can enable/disable arc4random
     ⇒ code bloat, not much benefit
• if using arc4random(), we use arc4random_addrandom() to
   feed assignments to $RANDOM back, but OpenBSD removed
   the interface…
• we used to ship one, which wrote to /dev/urandom to feed
   back to the kernel, but the Gentoo Linux people didn’t
   like that

Found the patch, no idea what bothered them; I guess they just didn't know you can safely write to /dev/random and urandom on linux.

Since they complain, an even simpler implementation is a NOP :)

• using arc4random() on systems where it’s not in libc will
   make it the packager’s choice, which we don’t like much

But we can just bite the bullet here and say “we use arc4random
if you have it, and otherwise you get something that always
produces the same output sequence for the same assignment to
RANDOM, and it’s your fault”. This a̲l̲s̲o̲ has a downside, namely
people expecting deterministic output again, if they program
and test on a “lesser” OS (one without arc4random), and we just
use arc4random_pushb_fast() macro on MirBSD for pushback, and
if it doesn’t exist (OpenBSD, Linux – also sorta lesser OSes)
there is no pushback…

"Remove arc4random_stir() and arc4random_addrandom(), which none should
be using directly.  Well, a few rare people cloned it upstream and it
will take a bit of time for them to learn.
ok various"

Do you know what linux does better than openbsd? Rationale for commits goes in the commit message rather than just being mentioned...


As for “better” LCG-ish things: nah, it’s either cryptographically
sound (aRC4) or speed/convenience.

ACK.


I think we’d be best off with an mksh not promising anything
about the quality of its $RANDOM, and using arc4random() only
where it is really convenient (e.g. on OpenBSD and MirBSD, the
libc malloc() uses it already anyway). We kinda don’t promise
anything in the manpage already… and it would shrink the code.

I'll look into the cvs equivalent of "git log -p" when I have the time, but a glance at mksh's history suggests that it used to have arc4random.c - was it really that painful to port?

Btw, agreed, arc4random should be everywhere.


Right. But I have a pure shell implementation for when people
really want it… example use:

https://www.teckids.org/gitweb/?p=verein.git;a=blob;f=util/projrand;h=80f3210cf77314c630086def1062958a528414ab;hb=HEAD

(Watch that space. I also already have the idea to add the
timing of the “Glücksfee” (person doing the lottery drawing
by hitting Return occasionally) to the arcfour state…)

I didn't know urandom in linux had these kind of problems, but


It does…

Any reference? I knew about the insanity of spitting out "random" bytes before being fully seeded, and I know (I compared myself) that it's slower than other implementations, but what's the problem with getting random bytes out of /dev/urandom?

mksh -c 'for i in $(seq 1000) ; do head /dev/urandom||exit; done' | wc -c

runs in just above one second here (on linux) and outputs ~2.44MB; /proc/sys/kernel/random/entropy_avail outputs 912 after running the above a few times - did I miss anything?

modern arc4random uses chacha20, which only requires 16/32 bytes for
its keys.


N̲O̲T̲ “modern” but “OpenBSD’s latest”. This is not “modern”,
it just follows a worrisome trend – not only is DJB’s code
basically illegible (coding style) and incomprehensible to
non-mathematicians, but also unlicenced software and thus
violating http://www.openbsd.org/policy.html (especially
the last two paragraphs), as it both is not ultimately clear
whether DJB’s code is really in PD in the USA, and it most
certainly is not in PD in most other countries.

Except for the trend about complex code (I'm running systemd right now!) I have to disagree on just about everything.


= Worrisome trend

Look at the papers describing chacha20/salsa20 (chacha20 is described as "salsa20 with these changes"); I'm not a mathematician, but the papers are __really__ readable, they do a lot to explain the design decisions and just about any developer could write a C implementation of chacha after reading them - seriously, try it, he literally describes how to implement the algorithm bottom up, and I bet you'll end up with code that strongly resembles http://cr.yp.to/streamciphers/timings/estreambench/submissions/salsa20/chacha8/ref/chacha.c (notice that the api was chosen by the eSTREAM competition).

The code used in openbsd's chacha is basically the above with only two changes:

1) it's optimized (by djb) in the __obvious__ way, ie he replaces "u32 x[16],input[16]" with "u32 x0,x1,..." - the speed difference is much bigger than I thought! You pay it with more lines, but they are boringly obvious...

2) they (openbsd) added "#ifndef KEYSTREAM_ONLY" - the original xors the plaintext, and gives out the keystream by xoring with zero; since they only want the keystream...
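
To give an idea of how little there is to it, here is a C sketch of the ChaCha quarter round and one double round in the array-based form (the "u32 x[16]" style mentioned above). The function and macro names are made up for this example and this is neither djb's reference code nor OpenBSD's file, only the same arithmetic written out; the merged variant does the same thing on sixteen separate variables.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
#define ROTL32(v, n) ((uint32_t)(((v) << (n)) | ((v) >> (32 - (n)))))

/* One quarter round on four words of the 16-word state. */
void
chacha_quarterround(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = ROTL32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = ROTL32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = ROTL32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = ROTL32(x[b] ^ x[c], 7);
}

/* One double round: four column rounds, then four diagonal rounds. */
void
chacha_doubleround(uint32_t x[16])
{
	chacha_quarterround(x, 0, 4, 8, 12);
	chacha_quarterround(x, 1, 5, 9, 13);
	chacha_quarterround(x, 2, 6, 10, 14);
	chacha_quarterround(x, 3, 7, 11, 15);
	chacha_quarterround(x, 0, 5, 10, 15);
	chacha_quarterround(x, 1, 6, 11, 12);
	chacha_quarterround(x, 2, 7, 8, 13);
	chacha_quarterround(x, 3, 4, 9, 14);
}
```

chacha20 runs ten such double rounds on a copy of the input state and then adds the input words back in; that part is left out here.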

Also some of the choices are so insanely good that even I can appreciate them - eg remember how one of rc4's weak points is the lousy key schedule, and you need to skip n*256 bytes? Compare chacha20's key schedule and make sure your jaw doesn't drop...
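
For comparison, the whole chacha "key schedule" for a 256-bit key amounts to copying words into the state. The C sketch below follows the layout of the original reference code (constants, key, 64-bit block counter, 64-bit nonce); the function names are invented for this example and it is not the reference implementation itself.

```c
#include <stdint.h>

/* Read four bytes as a little-endian 32-bit word. */
uint32_t
load32_le(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: fill the 16-word state from key and nonce. */
void
chacha_setup_sketch(uint32_t st[16], const uint8_t key[32],
    const uint8_t iv[8])
{
	int i;

	/* "expand 32-byte k" as four little-endian words */
	st[0] = 0x61707865U;
	st[1] = 0x3320646eU;
	st[2] = 0x79622d32U;
	st[3] = 0x6b206574U;
	/* the key goes in unchanged, no mixing pass, nothing to discard */
	for (i = 0; i < 8; i++)
		st[4 + i] = load32_le(key + 4 * i);
	/* 64-bit block counter starts at zero */
	st[12] = 0;
	st[13] = 0;
	/* 64-bit nonce */
	st[14] = load32_le(iv);
	st[15] = load32_le(iv + 4);
}
```

All the mixing happens in the rounds for each output block, so there is no initial keystream that has to be thrown away.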


Don't even get me started on security ;)


= Unlicensed software

Look at http://openbsd.cs.toronto.edu/cgi-bin/cvsweb/src/lib/libc/crypt/chacha_private.h,

/*
chacha-merged.c version 20080118
D. J. Bernstein
Public domain.
*/

I know that djb used to screw up licenses in a glamorous way, but luckily he changed his mind a few years ago, see http://cr.yp.to/publicdomain.html and http://cr.yp.to/distributors.html - he changed the old troublesome "license" and got at least a copyright lawyer insisting he didn't to admit that he was wrong :)


(DJB could fix that easily – others do – but refuses to
even acknowledge the problem we non-US-Americans have.
But then, his track record wrt. software licencing is
pretty… dirty.)

Anyway, I do not discuss upstream things like this in the
Debian bugtracker, as this is most definitely not a bug
in the package.

You're totally right, sending emails is too easy - sorry :)


Thanks for agreeing, and sorry for taking a bit to respond.

bye,
//mirabilos

Np, good answers are better than fast answers (I'm often guilty of answering too fast and regret it later).
