[
https://issues.apache.org/jira/browse/NIFI-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902290#comment-15902290
]
ASF GitHub Bot commented on NIFI-3313:
--------------------------------------
GitHub user alopresto opened a pull request:
https://github.com/apache/nifi/pull/1579
NIFI-3313 Added explicit Java runtime argument to default bootstrap.c…
…onf to avoid blocking on VM deployment.
This PR needs review in a specific environment. The reported issue is that
NiFi running in a container or Virtual Machine environment that does not have
access to sufficient entropy will block indefinitely on startup, right after
the "Loaded *n* properties" message:
```
2017-03-08 16:38:07,479 INFO [main] org.apache.nifi.NiFi Launching NiFi...
2017-03-08 16:38:07,656 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Determined default nifi.properties path to be '/Users/alopresto/Workspace/nifi/nifi-assembly/target/nifi-1.2.0-SNAPSHOT-bin/nifi-1.2.0-SNAPSHOT/./conf/nifi.properties'
2017-03-08 16:38:07,659 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Loaded 124 properties from /Users/alopresto/Workspace/nifi/nifi-assembly/target/nifi-1.2.0-SNAPSHOT-bin/nifi-1.2.0-SNAPSHOT/./conf/nifi.properties
2017-03-08 16:38:07,665 INFO [main] org.apache.nifi.NiFi Loaded 124 properties
```
I have added a Java runtime argument to `conf/bootstrap.conf` which sets
Java's entropy gathering device property (`java.security.egd`) to
`/dev/urandom`. This is *not* a security concern because NiFi is *not*
generating long-lived secrets at startup (see the many additional explanatory
resources in NIFI-3313).
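
For reviewers who want to confirm that the argument actually reaches the JVM, here is a minimal sketch. It assumes the new entry takes a form such as `java.arg.N=-Djava.security.egd=file:/dev/urandom`, where `N` is a placeholder index rather than the value used in this patch. The class name `EgdCheck` and its helper methods are hypothetical and not part of NiFi. It prints the configured entropy source and times a small seed request, which is the kind of call that can stall when the source blocks.

```java
import java.security.SecureRandom;

// Hypothetical helper, not part of NiFi: reports which entropy source the JVM
// was started with and how long a small seed request takes.
final class EgdCheck {

    /** Returns the value of java.security.egd, or "(not set)" if the JVM was started without it. */
    static String configuredEgd() {
        return System.getProperty("java.security.egd", "(not set)");
    }

    /** Requests numBytes of seed material and returns the elapsed wall-clock time in milliseconds. */
    static long timeSeedMillis(int numBytes) {
        long start = System.nanoTime();
        byte[] seed = new SecureRandom().generateSeed(numBytes);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        if (seed.length != numBytes) {
            throw new IllegalStateException("Unexpected seed length: " + seed.length);
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        System.out.println("java.security.egd = " + configuredEgd());
        System.out.println("generateSeed(32) took " + timeSeedMillis(32) + " ms");
    }
}
```

Running it once with and once without `-Djava.security.egd=file:/dev/urandom` on an affected machine should show whether the seed request is where the time goes. On a machine with plenty of entropy both runs are expected to return almost immediately.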
However, I cannot reproduce the original issue locally. I have tried
running the application on my native OS (Mac OS X 10.11.6), in a Docker
container (`aldrin/apache-nifi`) on the Boot2Docker ISO, and in a Docker
container (`aldrin/apache-nifi`) on a new Ubuntu Xenial 16.04.2 LTS
installation inside VirtualBox. In none of these environments could I
successfully block NiFi from starting.
I request that whoever reviews this is someone who has encountered the
blocking issue and can consistently reproduce it in order to ensure this change
solves the problem. I have run the patched version on native OS (i.e. direct
access to PRNG) and there were no ill effects.
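
To help a reviewer decide whether their environment is a candidate for reproducing the hang, the sketch below reads the Linux kernel's entropy estimate from `/proc/sys/kernel/random/entropy_avail`. The class `EntropyEstimate` is a hypothetical helper, not part of NiFi. The file only exists on Linux, so the method returns an empty result elsewhere. A low reading is only a hint that `/dev/random` may block, not proof that it will.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalInt;

// Hypothetical helper, not part of NiFi: reads the kernel's entropy estimate.
final class EntropyEstimate {

    /** Linux-specific location of the kernel's current entropy estimate, in bits. */
    static final Path ENTROPY_AVAIL = Paths.get("/proc/sys/kernel/random/entropy_avail");

    /** Parses the integer stored in source; returns empty if the file is missing, unreadable, or malformed. */
    static OptionalInt read(Path source) {
        if (!Files.isReadable(source)) {
            return OptionalInt.empty();
        }
        try {
            String text = new String(Files.readAllBytes(source), StandardCharsets.US_ASCII).trim();
            return OptionalInt.of(Integer.parseInt(text));
        } catch (IOException | NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        OptionalInt bits = read(ENTROPY_AVAIL);
        System.out.println(bits.isPresent()
                ? "Kernel entropy estimate: " + bits.getAsInt() + " bits"
                : "No entropy estimate available on this platform");
    }
}
```

Checking the value immediately after the VM or container boots, before starting NiFi, is the most relevant measurement, since the reported hang happens on first startup.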
<hr>
Thank you for submitting a contribution to Apache NiFi.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number
you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target
branch (typically master)?
- [ ] Is your initial contribution a single, squashed commit?
### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to
.name (programmatic access) for each of the new properties?
### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in
which it is rendered?
### Note:
Please ensure that once the PR is submitted, you check travis-ci for build
issues and submit an update to your PR as soon as possible.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/alopresto/nifi NIFI-3313
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nifi/pull/1579.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1579
----
commit 654d616407cc7271d818b8902a17e9dafcafbb2f
Author: Andy LoPresto <[email protected]>
Date: 2017-03-09T00:44:49Z
NIFI-3313 Added explicit Java runtime argument to default bootstrap.conf to
avoid blocking on VM deployment.
----
> First deployment of NiFi can hang on VMs without sufficient entropy if using
> /dev/random
> ----------------------------------------------------------------------------------------
>
> Key: NIFI-3313
> URL: https://issues.apache.org/jira/browse/NIFI-3313
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.1.1
> Reporter: Andy LoPresto
> Assignee: Andy LoPresto
> Priority: Critical
> Labels: entropy, security, virtual-machine
>
> h1. Analysis of Issue
> h2. Statement of Problem:
> NiFi deployed on a headless VM (little user interaction by way of keyboard and
> mouse I/O) can take 5-10 minutes (reported) to start up. User reports this
> occurs on a "secure" cluster. Further examination is required to determine
> which specific process requires the large amount of random input (no steps to
> reproduce, configuration files, logs, or VM environment information
> provided).
> h2. Context
> The likely cause of this issue is that a process is attempting to read from
> _/dev/random_, a \*nix "device" providing a pseudo-random number generator
> (PRNG). Also available is _/dev/urandom_, a related PRNG. Despite common
> misperceptions, _/dev/urandom_ is not "less secure" than _/dev/random_ for
> general use cases. _/dev/random_ blocks if the entropy *estimate* (a
> "guess" of the existing entropy introduced into the pool) is lower than the
> amount of random data requested by the caller. In contrast, _/dev/urandom_
> does not block, but provides the output of the same cryptographically-secure
> PRNG (CSPRNG) that _/dev/random_ reads from \[myths\]. After as little as 256
> bytes of initial seeding, accessing _/dev/random_ and _/dev/urandom_ are
> functionally equivalent, as the long period of random data generated will not
> require re-seeding before sufficient entropy can be provided again.
> As mentioned earlier, further examination is required to determine if the
> process requiring random input occurs at application boot or only at
> "machine" (hardware or VM) boot. On the first deployment of the system with
> certificates, the certificate generation process will require substantial
> random input. However, on application launch and connection to a cluster,
> even the TLS/SSL protocol requires some amount of random input.
> h2. Proposed Solutions
> h3. rngd
> A software toolset called _rng-tools_ \[rngtools\] exists for Linux for
> accessing a dedicated hardware RNG (*true* RNG, or TRNG). Specialized hardware,
> as well as Intel chips from Ivy Bridge (2012) onward, can provide
> hardware-generated random input to the kernel. Using the daemon _rngd_ to
> seed the _/dev/random_ and _/dev/urandom_ entropy pool is the simplest
> solution.
> *Note: Do not use _/dev/urandom_ to seed _/dev/random_ using _rngd_. This is
> like running a garden hose from a car's exhaust back into its gas tank and
> trying to drive.*
> h3. Instruct Java to use /dev/urandom
> The Java Runtime Environment (JRE) can be instructed to use _/dev/urandom_
> for all invocations of {{SecureRandom}}, either on a per-Java process basis
> \[jdk-urandom\] or in the JVM configuration \[oracle-urandom\], which means
> it will not block on server startup. The NiFi {{bootstrap.conf}} file can be
> modified to contain an additional Java argument directing the JVM to use
> _/dev/urandom_.
> h2. Other Solutions
> h3. Entropy Gathering Tools
> Tools to gather entropy from non-standard sources (audio card noise, video
> capture from webcams, etc.) have been developed such as audio-entropyd
> \[wagner\], but these tools are not verified or well-examined -- usually when
> tested, they are only tested for the strength of their PRNG, not the ability
> of the tool to capture entropy and generate sufficiently random data
> unavailable to an attacker who may be able to determine the internal state.
> h3. haveged
> A solution has been proposed to use {{haveged}} \[haveged\], a user-space
> daemon relying on the HAVEGE (HArdware Volatile Entropy Gathering and
> Expansion) construct to continually increase the entropy on the system,
> allowing _/dev/random_ to run without blocking.
> However, on further investigation, multiple sources indicate this solution
> may be insecure \[dice\]\[leek-havege\].
> Michael Kerrisk:
> bq. Having read a number of papers about HAVEGE, Peter \[Anvin\] said he had
> been unable to work out whether this was a "real thing". Most of the papers
> that he has read run along the lines, "we took the output from HAVEGE, and
> ran some tests on it and all of the tests passed". The problem with this sort
> of reasoning is the point that Peter made earlier: there are no tests for
> randomness, only for non-randomness.
> bq. One of Peter's colleagues replaced the random input source employed by
> HAVEGE with a constant stream of ones. All of the same tests passed. In other
> words, all that the test results are guaranteeing is that the HAVEGE
> developers have built a very good PRNG. It is possible that HAVEGE does
> generate some amount of randomness, Peter said. But the problem is that the
> proposed source of randomness is simply too complex to analyze; thus it is
> not possible to make a definitive statement about whether it is truly
> producing randomness. (By contrast, the HWRNGs that Peter described earlier
> have been analyzed to produce a quantum theoretical justification that they
> are producing true randomness.) "So, while I can't really recommend it, I
> can't not recommend it either." If you are going to run HAVEGE, Peter
> strongly recommended running it together with rngd, rather than as a
> replacement for it.
> Tom Leek:
> bq. Of course, the whole premise of HAVEGE is questionable. For any practical
> security, you need a few "real random" bits, no more than 200, which you use
> as seed in a cryptographically secure PRNG. The PRNG will produce gigabytes
> of pseudo-\[data\] indistinguishable from true randomness, and that's good
> enough for all practical purposes.
> bq. Insisting on going back to the hardware for every bit looks like yet
> another outbreak of that flawed idea which sees entropy as a kind of
> gasoline, which you burn up when you look at it.
> h2. Next Steps
> As described above, further investigation is necessary, but moving forward,
> barring new information, I would propose directing the JVM to use
> _/dev/urandom_ and making _rngd_ available to systems that support a TRNG.
> [myths] http://www.2uo.de/myths-about-urandom/
> [rngtools]
> https://git.kernel.org/cgit/utils/kernel/rng-tools/rng-tools.git/about/
> [jdk-urandom] http://stackoverflow.com/a/2325109/70465
> [oracle-urandom]
> https://docs.oracle.com/cd/E13209_01/wlcp/wlss30/configwlss/jvmrand.html
> [wagner] https://people.eecs.berkeley.edu/~daw/rnd/
> [haveged] http://www.issihosts.com/haveged/
> [dice] https://lwn.net/Articles/525459/
> [leek-havege] http://security.stackexchange.com/a/34552/16485