Re: [Python-Dev] Our responsibilities (was Re: BDFL ruling request: should we block forever waiting for high-quality random bits?)

2016-06-16 Thread Theodore Ts'o
On Thu, Jun 16, 2016 at 03:24:33PM +0300, Barry Warsaw wrote:
> Except that I disagree.  I think os.urandom's original intent, as documented
> in Python 3.4, is to provide a thin layer over /dev/urandom, with all that
> implies, and with the documented quality caveats.  I know as a Linux developer
> that if I need to know the details of that, I can `man urandom` and read the
> gory details.  In Python 3.5, I can't do that any more.

If Python were to document os.urandom as providing a thin wrapper over
/dev/urandom as implemented on Linux, and also document os.getrandom
as providing a thin wrapper over getrandom(2) as implemented on Linux.
And then say that the best emulation of those two interfaces will be
provided say that on other operating systems, and that today the best
practice is to call getrandom with the flags set to zero (or defaulted
out), that would certainly make me very happy.

I could imagine that some people might complain that it is too
Linux-centric, or it is not adhering to Python's design principles,
but it makes a lot sense of me as a Linux person.  :-)

Cheers,

- Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-15 Thread Theodore Ts'o
On Wed, Jun 15, 2016 at 04:12:57PM -0700, Nathaniel Smith wrote:
> - It's not exactly true that the Python interpreter doesn't need
> cryptographic randomness to initialize SipHash -- it's more that
> *some* Python invocations need unguessable randomness (to first
> approximation: all those which are exposed to hostile input), and some
> don't. And since the Python interpreter has no idea which case it's
> in, and since it's unacceptable for it to break invocations that don't
> need unguessable hashes, then it has to err on the side of continuing
> without randomness. All that's fine.

In practice, those Python ivocation which are exposed to hostile input
are those that are started while the network are up.  The vast
majority of time, they are launched by the web brwoser --- and if this
happens after a second or so of the system getting networking
interrupts, (a) getrandom won't block, and (b) /dev/urandom and
getrandom will be initialized.

Also, I wish people would say that this is only an issue on Linux.
Again, FreeBSD's /dev/urandom will block as well if it is
uninitialized.  It's just that in practice, for both Linux and
Freebsd, we try very hard to make sure /dev/urandom is fully
initialized by the time it matters.  It's just that so far, it's only
on Linux when there was an attempt to use Python in the early init
scripts, and in a VM and in a system where everything is modularized
such that the deadlock became visible.


> (I guess the way to implement this would be for the SipHash
> initialization code -- which runs very early -- to set some flag, and
> then we expose that flag in sys._something, and later in the startup
> sequence check for it after the warnings module is functional.
> Exposing the flag at the Python level would also make it possible for
> code like cloud-init to do its own explicit check and respond
> appropriately.)

I really don't think it's that big a of a deal in *practice*, and but
if you really are concerned about the very remote possibility that a
Python invocation could start in early boot, and *then* also stick
around for the long term, and *then* be exosed to hostile input ---
what if you set the flag, and then later on, N minutes, either
automatically, or via some trigger such as cloud-init --- try and see
if /dev/urandom is initialized (even a few seconds later, so long as
the init scripts are hanging, it should be initialized) have Python
hash all of its dicts, or maybe just the non-system dicts (since those
are presumably the ones mos tlikely to be exposed hostile input).

  - Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-13 Thread Theodore Ts'o
On Sun, Jun 12, 2016 at 06:53:54PM -0700, Nathaniel Smith wrote:
> 
> Speaking of full-stack perspectives, would it affect your decision if
> Debian Stretch were made robust against blocking /dev/urandom on
> AWS/GCE? Because I think we could find lots of people who would be
> overjoyed to fix Stretch before the next merge window even opens
> (AFAICT the quick fix is literally a 1 line patch), if that allowed
> the blocking /dev/urandom patches to go in upstream...

Alas, it's not just Debian.  Apparently it breaks the boot on Openwrt
as well as Ubuntu Quantal:

https://lkml.org/lkml/2016/6/13/48
https://lkml.org/lkml/2016/5/31/599

(Yay for an automated test infrastructure that fires off as soon as
you push to an externally visible git repository.  :-)

I haven't investigated to see exactly *why* it's blowing up on these
userspace setups, but it's a great reminder for why changing an
established interface is something that has to be done very carefully
indeed.

   - Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-12 Thread Theodore Ts'o
On Sun, Jun 12, 2016 at 09:01:09PM +0100, Cory Benfield wrote:
> My problem with /dev/urandom is that it’s a trap, lying in wait for
> someone who doesn’t know enough about the problem they’re solving to
> step into it.

And my answer to that question is absent backwards compatibility
concerns, use getrandom(2) on Linux, or getentropy(2) on *BSD, and be
happy.  Don't use /dev/urandom; use getrandom(2) instead.  That way
you also solve a number of other problems such as the file descriptor
DOS attack issue, etc.

The problem with Python is that you *do* have backwards compatibility
concerns.  At which point you are faced with the same issues that we
are in the kernel; except I gather than that the commitment to
backwards compatibility isn't quite as absolute (although it is
strong).  Which is why I've been trying very hard not to tell
python-dev what to do, but rather to give you folks the best
information I can, and then encouraging you to do whatever seems most
"Pythony" --- which might or might not be the same as the decisions
we've made in the kernel.

Cheers,

- Ted

P.S.  BTW, I probably won't change the behaviour of /dev/urandom to
make it be blocking.  Before I found out about Pyhton Bug #26839, I
actually had patches that did make /dev/urandom blocking, and they
were planned to for the next kernel merge window.  But ultimately, the
reason why I won't is because there is a set of real users (Debian
Stretch users on Amazon AWS and Google GCE) for which if I changed how
/dev/urandom worked, then I would be screwing them over, even if
Python 3.5.2 falls back to /dev/urandom.  It's not a problem for bare
metal hardware and cloud systems with virtio-rng; I have patches that
will take care of those scenarios.

Unfortunately, both AWS and GCE don't support virtio-rng currently,
and as much as some poeple are worried about the hypothetical problems
of stupidly written/deployed Python scripts that try to generate
long-term secrets during early boot, weighed against the very real
prospect of user lossage on two of the most popular Cloud environments
out there --- it's simply no contest.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-12 Thread Theodore Ts'o
On Sun, Jun 12, 2016 at 11:07:22AM -0700, Nathaniel Smith wrote:
> But for example, if a process is actively blocked waiting
> for the initial entropy, one could spawn a kernel thread that keeps the
> system from quiescing by attempting to scrounge up entropy as fast as
> possible, via whatever mechanisms are locally appropriate (e.g. doing a
> busy-loop racing two clocks against each other, or just scheduling lots of
> interrupts -- which I guess is the same thing, more or less).

There's a lot of snake oil, or at least, hand waving, that goes on
with respect to what will actually work to gather randomness.  One of
the worst possible choices is a standard, kernel-defined workload that
tries to just busy loop two clocks against each other.  For one thing,
on many embedded systems, all of your clocks are generated off of a
single master oscillator anyway.  And in early boot, it's not
realistic for the kernel to be able to measure network interrupt
timings and radio strength indicators from the WiFi, which ultimately
is going to be much more likely to be unpredictable by an outside
attacker sitting in Fort Meade than pretending that you can just
"schedule lots of interrupts".

Again, part of the problem here is that if you really want to be
secure, it needs to be a full stack perspective, where the hardware
designers, the OS developers, and the application level developers are
all working together.  If one side tries to exert a strong "somebody
else's problem field", it's very likely the end solution isn't going
to be secure.  Because in many cases this is simply not practical, we
all have to make assumptions at the OS and C-Python interpreter level,
and hope that the assumptions that we make are are conservative
enough.

> Is this an approach that you've considered?

Ultimately, the arguments made by approaches such as Jitterbug are, to
put it succiently and perhaps a little unfairly, "gee whillikers, the
Intel L1/L2 cache hierarchy is really complicated and it's a closed
hardware implementation so no one can understand it, and besides, the
statistical analysis of the output looks good".

To which I would say, "the first argument is an argument of security
through ignorance", and "AES(NSA_KEY, COUNTER++)" also has really
great statistical results, and if you don't know the NSA_KEY, it will
look very strong and as far as we know, we wouldn't be able to
distinguish it from truly secure random number generator --- but it
really isn't secure.

So yeah, I don't buy it.  In order for it to be secure, we need to be
grabbing measurements which can't be replicated or determined by a
remote attacker.  So having the kernel kick off a kernel thread is not
going to be useful unless we can mix in entropy from the user, or the
workload, or the local configuration, or from the local environment.
(Using RSSI is helpful because the remote attacker might not know
whether your mobile handset is in the knapsack under the table, or on
the desk, and that will change the RSSI numbers.)  Remember, the whole
*point* of modern CPU designs is that the huge amounts of engineering
effort is put into making the CPU be predictable, and so spawning a
kernel thread in isolation isn't going perform magic in terms of
getting guaranteed unpredictability.

> FWIW, the systemd thing is a red herring -- this was debian's configuration
> of a particular daemon that is not maintained by the systemd project, and
> the exact same thing would have happened with sysvinit if debian had tried
> using python 3.5 early in their rcS.

It's not a daemon.  It's the script in
/lib/systemd/system-generators/systemd-crontab-generator, and it's
needed because systemd subsumed the cron daemon, and developers who
wanted to not break user's existing crontab files turned to it.  I
suppose you are technically correct that it is not mainained by
systemd, but the need for it was generated out of systemd's lack of
concern of backwards compatibility.

Because FreeBSD and Mac OS are not using systemd, they are not likely
to run into this problem.  I will grant that if they decided to try to
run a python script out of their /etc/rc script, they would run into
the same problem.

- Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-12 Thread Theodore Ts'o
On Sun, Jun 12, 2016 at 11:40:58AM +0100, Cory Benfield wrote:
> 
> Of this set, only cloud-init worries me, and it worries me for the
> *opposite* reason that Guido and Larry are worried. Guido and Larry
> are worried that programs like cloud-init will be delayed by two
> minutes while they wait for entropy: that’s an understandable
> concern. I’m much more worried that programs like cloud-init may
> attempt to establish TLS connections or create keys during this two
> minute window, leaving them staring down the possibility of
> performing “secure” actions with insecure keys.

There are patches in the dev branch of:

 https://git.kernel.org/cgit/linux/kernel/git/tytso/random.git/

which will automatically use virtio-rng (if it is provided by the
cloud provider) to initialize /dev/urandom.  It also uses a much more
aggressive mechanism to initialize the /dev/urandom pool, so that
getrandom(2) will block for a much shorter period of time immediately
after boot time on real hardware.  I'm confident it's secure for x86
platforms.  I'm still thinking about whether I should fall back to
something more conservative for crappy embedded processors that don't
have a cycle counter or an CPU-provided RDRAND-like instruction.
Related to this is whether I should finally make the change so that
/dev/urandom will block until it is initialized.  (This would make
Linux work like FreeBSD, which *will* also block if its entropy pool
is not initialized.)

> This is why I advocate, like Donald does, for having *some* tool in
> Python that allows Python programs to crash if they attempt to
> generate cryptographically secure random bytes on a system that is
> incapable of providing them (which, in practice, can only happen on
> Linux systems).

Well, it can only happen on Linux because you insist on falling back
to /dev/urandom --- and because other OS's have the good taste not to
use systemd and/or Python very early in the boot process.  If someone
tried to run a python script in early FreeBSD init scripts, it would
block just as you were seeing on Linux --- you just haven't seen that
yet, because arguably the FreeBSD developers have better taste in
their choice of init scripts than Red Hat and Debian.  :-)

So the question is whether I should do what FreeBSD did, which will
statisfy those people who are freaking out and whinging about how
Linux could allow stupidly written or deployed Python scripts get
cryptographically insecure bytes, by removing that option from Python
developers.  Or should I remove that one line from changes in the
random.git patch series, and allow /dev/urandom to be used even when
it might be insecure, so as to satisfy all of the people who are
freaking out and whinging about the fact that a stupildly written
and/or deployed Python script might block during early boot and hang a
system?

Note that I've tried to do what I can to make the time that
/dev/urandom might block as small as possible, but at the end of the
day, there is still the question of whether I should remove the choice
re: blocking from userspace, ala FreeBSD, or not.  And either way,
some number of people will be whinging and freaking out.  Which is why
I completely sympathetic to how Guido might be getting a little
exasperated over this whole thread.  :-)

- Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-12 Thread Theodore Ts'o
On Sun, Jun 12, 2016 at 01:49:34AM -0400, Random832 wrote:
> > The intention behind getrandom() is that it is intended *only* for
> > cryptographic purposes. 
> 
> I'm somewhat confused now because if that's the case it seems to
> accomplish multiple unrelated things. Why was this implemented as a
> system call rather than a device (or an ioctl on the existing ones)? If
> there's a benefit in not going through the non-atomic (and possibly
> resource limited) procedure of acquiring a file descriptor, reading from
> it, and closing it, why is that benefit not also extended to
> non-cryptographic users of urandom via allowing the system call to be
> used in that way?

This design was taken from OpenBSD, and the goal with getentropy(2)
(which is also designed only for cryptographic use cases), was so that
a denial of service attack (fd exhaustion) could force an application
to fall back to a weaker -- in some cases, very weak or non-existent
--- source of randomness.

Non-cryptographic users don't need to use this interface at all.  They
can just use srandom(3)/random(3) and be happy.

> > Anyway, if you don't need cryptographic guarantees, you don't need
> > getrandom(2) or getentropy(2); something like this will do just fine:
> 
> Then what's /dev/urandom *for*, anyway?

/dev/urandom is a legacy interface.  It was intended originally for
cryptographic use cases, but it was intended for the days when very
few programs needed a secure cryptographic random generator, and it
was assumed that application programmers would be very careful in
checking error codes, etc.

It also dates back to a time when the NSA was still pushing very hard
for cryptographic export controls (hence the use of SHA-1 versus an
encryption algorithm) and when many people questioned whether or not
the SHA-1 algorithm, as designed by the NSA, had a backdoor in it.
(As it turns out, the NSA put a back door into DUAL-EC, so retrospect
this concern really wasn't that unreasonable.)  Because of those
concerns, the assumption is those few applications who really wanted
to get security right (e.g., PGP, which still uses /dev/random for
long-term key generation), would want to use /dev/random and deal with
entropy accounting, and asking the user to type randomness on the
keyboard and move their mouse around while generating a random key.

But times change, and these days people are much more likely to
believe that SHA-1 is in fact cryptographically secure, and future
crypto hash algorithms are designed by teams from all over the world
and NIST/NSA merely review the submissions (along with everyone else).
So for example, SHA-3 was *not* designed by the NSA, and it was
evaluated using a much more open process than SHA-1.

Also, we have a much larger set of people writing code which is
sensitive to cryptographic issues (back when I wrote /dev/random, I
probably had met, or at least electronically corresponded with a large
number of the folks who were working on network security protocols, at
least in the non-classified world), and these days, there is much less
trust that people writing code to use /dev/[u]random are in fact
careful and competent security engineers.  Whether or not this is a
fair concern or not, it is true that there has been a change in API
design ethos away from the "Unix let's make things as general as
possible, in case someone clever comes up use case we didn't think
of", to "idiots are ingenious so they will come up with ways to misuse
an idiot-proof interface, so we need to lock it down as much as
possible."  OpenBSD's getentropy(2) interface is a strong example of
this new attitude towards API design, and getrandom(2) is not quite so
doctrinaire (I added a flags field when getentropy(2) didn't even give
those options to progammers), but it is following in the same tradition.

Cheers,

- Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-12 Thread Theodore Ts'o
On Sat, Jun 11, 2016 at 05:46:29PM -0400, Donald Stufft wrote:
> 
> It was a RaspberryPI that ran a shell script on boot that called
> ssh-keygen.  That shell script could have just as easily been a
> Python script that called os.urandom via
> https://github.com/sybrenstuvel/python-rsa instead of a shell script
> that called ssh-keygen.

So I'm going to argue that the primary bug was in the how the systemd
init scripts were configured.  In generally, creating keypairs at boot
time is just a bad idea.  They should be created lazily, in a
just-in-time paradigm.

Consider that if you assume that os.urandom can block, this isn't
necessarily going to do the right thing either --- if you use
getrandom and it blocks, and it's part of a systemd unit which is
blocking futher boot progress, then the system will hang for 90
seconds, and while it's hanging, there won't be any interrupts, so the
system will be dead in the water, just like the orignal bug report
complaining that Python was hanging when it was using getrandom() to
initialize its SipHash.

At which point there will be another bug complaining about how python
was causing systemd to hang for 90 seconds, and there will be demand
to make os.random no longer block.  (Since by definition, systemd can
do no wrong; it's always other programs that have to change to
accomodate systemd.  :-)

So some people will freak out when the keygen systemd unit hangs,
blocking the boot --- and other people will freak out of the systemd
unit doesn't hang, and you get predictable SSH keys --- and some wiser
folks will be asking the question, why the *heck* is it not
openssh/systemd's fault for trying to generate keys this early,
instead of after the first time sshd needs host ssh keys?  If you wait
until the first time the host ssh keys are needed, then the system is
fully booted, so it's likely that the entropy will be collected -- and
even if it isn't, networking will already be brought up, and the
system will be in multi-user mode, so entropy will be collected very
quickly.

Sometimes, we can't solve the problem at the Python level or at the
Kernel level.  It will require security-saavy userspace/application
programmers as well.

Cheers,

- Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-11 Thread Theodore Ts'o
On Fri, Jun 10, 2016 at 05:14:50PM -0400, Random832 wrote:
> On Fri, Jun 10, 2016, at 15:54, Theodore Ts'o wrote:
> > So even on Python pre-3.5.0, realistically speaking, the "weakness" of
> > os.random would only be an issue (a) if it is run within the first few
> > seconds of boot, and (b) os.random is used to directly generate a
> > long-term cryptographic secret.  If you are fork openssl or ssh-keygen
> > to generate a public/private keypair, then you aren't using os.random.
> 
> So, I have a question. If this "weakness" in /dev/urandom is so
> unimportant to 99% of situations... why isn't there a flag that can be
> passed to getrandom() to allow the same behavior?

The intention behind getrandom() is that it is intended *only* for
cryptographic purposes.  For that use case, there's no point having a
"return a potentially unseeded cryptographic secret" option.  This
makes this much like FreeBSD's /dev/random and getentropy system
calls.

(BTW, I've seen an assertion on this thread that FreeBSD's
getentropy(2) never blocks.  As far as I know, this is **not** true.
FreeBSD's getentropy(2) works like its /dev/random device, in that if
it is not fully seeded, it will block.  The only reason why OpenBSD's
getentropy(2) and /dev/random devices will never block is because they
only support architectures where they can make sure that entropy is
passed from a previous boot session to the next, given specialized
bootloader support.  Linux can't do this because we support a very
large number of bootloaders, and the bootloaders are not under the
kernel developers' control.  Fundamentally, you can't guarantee both
(a) that your RNG will never block, and (b) will always be of high
cryptographic quality, in a completely general sense.  You can if you
make caveats about your hardware or when the code runs, but that's
fundamentally the problem with the documentation of os.urandom(); it's
making promises which can't be true 100% of the time, for all
hardware, operating environments, etc.)

Anyway, if you don't need cryptographic guarantees, you don't need
getrandom(2) or getentropy(2); something like this will do just fine:

long getrand()
{
static int initialized = 0;
struct timeval tv;

if (!initialized) {
gettimeofday(, NULL);
srandom(tv.tv_sec ^ tv.tv_usec ^ getpid());
initialized++;
}
return random();
}

So this is why I did what I did.  If Python decides to go down this
same path, you could define a new interface ala getrandom(2), which is
specifically designed for cryptogaphic purposes, and perhaps a new,
more efficient interface for those people who don't need cryptogaphic
guarantees --- and then keep the behavior of os.urandom consistent
with Python 3.4, but update the documentation to reflect the reality.

Alternatively, you could keep the implementation of os.urandom
consistent with Python 3.5, and then document that under some
circumstances, it will block.  Both approaches have certain tradeoffs,
but it's not going to be the end of the world regardless of which way
you decide to go.   I'd suggest that you use your existing mechanisms
to decide on which approach is more Pythony, and then make sure you
communicate and over-communicate it to your user/developer base.

And then --- relax.  It may seem like a big deal today, but in a year
or so people will have gotten used to whatever interface or
documentation changes you decide to make, and it will be all fine.
As Dame Julian of Norwich once said, "All shall be well, and all shall
be well, and all manner of things shall be well."  

Cheers,

- Ted
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

2016-06-10 Thread Theodore Ts'o
I will observe that feelings have gotten a little heated, so without
making any suggestions to how the python-dev community should decide
things, let me offer some observations that might perhaps shed a
little light, and perhaps dispell a little bit of the heat.

As someone who has been working in security for a long time --- before
I started getting paid to hack Linux full-time, worked on Kerberos,
was on the Security Area Directorate of the IETF, where among other
things I was one of the working group chairs for the IP Security
(ipsec) working group --- I tend to cringe a bit when people talk
about security in terms of absolutes.  For example, the phrase
"improving Python's security".  Security is something that is best
talked about given a specific threat environment, where the value of
what you are trying to protect, the capabilities and resources of the
attackers, etc., are all well known.

This gets hard for those of us who work on infrastructure which can
get used in many different arenas, and so that's something that
applies both to the Linux Kernel and to C-Python, because how people
will use the tools that we spend so much of our passion crafting is
largely out of our control, and we may not even know how they are
using it.

As far as /dev/urandom is concerned, it's true that it doesn't block
before it has been initialized.  If you are a security academic who
likes to write papers about how great you are at finding defects in
other people's work.  This is definitely a weakness.

Is it a fatal weakness?  Well, first of all, on most server and
desktop deployments, we save 1 kilobyte or so of /dev/urandom output
during the shutdown sequence, and immediately after the init scripts
are completed.  This saved entropy is then piped back into /dev/random
infrastructure and used initialized /dev/random and /dev/urandom very
early in the init scripts.  On a freshly instaled machine, this won't
help, true, but in practice, on most systems, /dev/urandom will get
initialized from interrupt timing sampling within a few seconds after
boot.  For example, on a sample Google Compute Engine VM which is
booted into Debian and then left idle, /dev/urandom was initialized
within 2.8 seconds after boot, while the root file system was
remounted read-only 1.6 seconds after boot.

So even on Python pre-3.5.0, realistically speaking, the "weakness" of
os.random would only be an issue (a) if it is run within the first few
seconds of boot, and (b) os.random is used to directly generate a
long-term cryptographic secret.  If you are fork openssl or ssh-keygen
to generate a public/private keypair, then you aren't using os.random.

Furthermore, if you are running on a modern x86 system with RDRAND,
you'll also be fine, because we mix in randomness from the CPU chip
via the RDRAND instruction.

So this whole question of whether os.random should block *is*
important in certain very specific cases, and if you are generating
long-term cryptogaphic secrets in Python, maybe you should be worrying
about that.  But to be honest, there are lots of other things you
should be worrying about as well, and I would hope that people writing
cryptographic code would be asking questions of how the random nunmber
stack is working, not just at the C-Python interpretor level, but also
at the OS level.

My preference would be that os.random should block, because the odds
that people would be trying to generate long-term cryptographic
secrets within seconds after boot is very small, and if you *do* block
for a second or two, it's not the end of the world.  The problem that
triggered this was specifically because systemd was trying to use
C-Python very early in the boot process to initialize the SIPHASH used
for the dictionary, and it's not clear that really needed to be
extremely strong because it wasn't a long-term cryptogaphic secret ---
certainly not how systemd was using that specific script!

The reason why I think blocking is better is that once you've solved
the "don't hang the VM for 90 seconds until python has started up",
someone who is using os.random will almost certainly not be on the
blocking path of the system boot sequence, and so blocking for 2
seconds before generating a long-term cryptographic secret is not the
end of the world.

And if it does block by accident, in a security critical scenario it
will hopefully force the progammer to think, and and in a non-security
critical scenario, it should be easy to switch to either a totally
non-blocking interface, or switch to a pseudo-random interface hwich
is more efficient.


*HOWEVER*, on the flip side, if os.random *doesn't* block, in 99.999%
percent of the cases, the python script that is directly generating a
long-term secret will not be started 1.2 seconds after the root file
system is remounted read/write, so it is *also* not the end of the
world.  Realistically speaking, we do know which processes are likely
to be generating long-term cryptographic secrets imnmediately after