[whew, actually read the whole thread]

On 11 June 2016 at 10:28, Terry Reedy <tjre...@udel.edu> wrote:
> On 6/11/2016 11:34 AM, Guido van Rossum wrote:
>>
>> In terms of API design, I'd prefer a flag to os.urandom() indicating a
>> preference for
>> - blocking
>> - raising an exception
>> - weaker random bits
>
>
> +100 ;-)
>
> I proposed exactly this 2 days ago, 5 hours after Larry's initial post.

No, this is a bad idea. Asking novice developers to make security decisions they're not yet qualified to make when it's genuinely possible for us to do the right thing by default is the antithesis of good security API design, and os.urandom() *is* a security API (whether we like it or not - third party documentation written by the cryptographic software development community has made it so, since it's part of their guidelines for writing security sensitive code in pure Python).

Adding *new* APIs is also a bad idea, since "os.urandom() is the right answer on every OS except Linux, and also the best currently available answer on Linux" has been the standard security advice for generating cryptographic secrets in pure Python code for years now, so we should only change that guidance if we have extraordinarily compelling reasons to do so, and we don't.

Instead, we have Ted Ts'o himself chiming in to say: "My preference would be that os.[u]random should block, because the odds that people would be trying to generate long-term cryptographic secrets within seconds after boot is very small, and if you *do* block for a second or two, it's not the end of the world."

The *actual bug* that triggered this latest firestorm of commentary (from experts and non-experts alike) had *nothing* to do with user code calling os.urandom, and instead was a combination of:

- CPython startup requesting cryptographically secure randomness when it didn't need it
- a systemd init script written in Python running before the kernel RNG was fully initialised

That created a deadlock between CPython startup and the rest of the Linux init process, so the latter only continued when the systemd watchdog timed out and killed the offending script. As others have noted, this kind of deadlock scenario is generally impossible on other operating systems, as the operating system doesn't provide a way to run Python code before the random number generator is ready.

The change Victor made in 3.5.2 to fall back to reading /dev/urandom directly if the getrandom() syscall returns EAGAIN (effectively reverting to the Python 3.4 behaviour) was the simplest possible fix for that problem (and an approach I thoroughly endorse, both for 3.5.2 and for the life of the 3.5 series), but that doesn't make it the right answer for 3.6+.

To repeat: the problem encountered was NOT due to user code calling os.urandom(), but rather due to the way CPython initialises its own internal hash algorithm at interpreter startup. However, due to the way CPython is currently implemented, fixing that regression not only changed the behaviour of CPython startup, it *also* changed the behaviour of every call to os.urandom() in Python 3.5.2+.

For 3.6+, we can instead make it so that the only things that actually rely on cryptographic quality randomness being available are:

- calling a secrets module API
- calling a random.SystemRandom method
- calling os.urandom directly

These are all APIs that were either created specifically for use in security sensitive situations (secrets module), or have long been documented (both within our own documentation, and in third party documentation, books and Q&A sites) as being an appropriate choice for use in security sensitive situations (os.urandom and random.SystemRandom).

However, we don't need to make those block waiting for randomness to be available - we can update them to raise BlockingIOError instead (which makes it trivial for people to decide for themselves how they want to handle that case).

Along with that change, we can make it so that starting the interpreter will never block waiting for cryptographic randomness to be available (since it doesn't need it), and importing the random module won't block waiting for it either.
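
As a rough illustration of the "decide for themselves" part, here is a minimal sketch of a caller that retries while BlockingIOError is raised. It assumes the behaviour proposed in this message, not anything that exists today; the helper name read_secure_bytes and its source, retries, delay and sleep parameters are hypothetical and only there to keep the example self-contained.

```python
import time


def read_secure_bytes(n, source, retries=50, delay=0.1, sleep=time.sleep):
    """Return n bytes from source(n), retrying while it raises BlockingIOError.

    `source` stands in for an API such as os.urandom under the *proposed*
    behaviour (an assumption: raising BlockingIOError while the kernel RNG
    is not yet initialised). All names here are hypothetical.
    """
    for _ in range(retries):
        try:
            return source(n)
        except BlockingIOError:
            # Kernel RNG not ready yet: wait a little and try again.
            sleep(delay)
    # Final attempt: let BlockingIOError propagate so the caller can choose
    # another strategy instead of waiting forever.
    return source(n)
```

A caller would pass the real API as the source, e.g. read_secure_bytes(16, os.urandom). Bounding the number of retries is a design choice for the sketch: an init script gets an error it can act on rather than a silent hang.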

To the best of our knowledge, on all operating systems other than Linux, encountering the new exception will still be impossible in practice, as there is no known opportunity to run Python code before the kernel random number generator is ready.

On Linux, init scripts may still run before the kernel random number generator is ready, but will now throw an immediate BlockingIOError if they access an API that relies on cryptographic randomness being available, rather than potentially deadlocking the init process. Folks encountering that situation will then need to make an explicit decision:

- loop until the exception is no longer thrown
- switch to reading from /dev/urandom directly instead of calling os.urandom()
- switch to using a cross-platform non-cryptographic API (probably the random module)

Victor has some additional technical details written up at http://haypo-notes.readthedocs.io/pep_random.html and I'd be happy to formalise this proposed approach as a PEP (the current reference is http://bugs.python.org/issue27282).

Regards,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia