Hi Cedric,

On 05/16/2017 12:05 PM, Cédric Champeau wrote:
Thanks Peter for your thoughts, but I don't think it's as simple as that. As I explained in my original email, there are multiple ways the environment variables can be mutated, and it can be done _externally_. Typically, during a task execution, a JNI call performed by a random native tool could mutate the environment. That's something we, as a build tool, have to consider as a use case. It's very unlikely but it can happen. This means that for the same classloader, the environment can change.

Reading the updated environment could be performed by Gradle using JNI after such plugin has finished, right?

And for performance reasons, we cache classloaders between builds, and reuse them as much as possible (because classloading is far from being cheap).

What about that instead of re-spawning the whole daemon when environment changes, Gradle just created new ClassLoader for plugins with updated environment variables. This would also account for situations where plugins "cache" the result of System.getenv() using static fields like:

public class SomePlugin {
static final String SOME_CONFIGURATION = System.getenv("SOME_CONFIGURATION");

...and if you also keep ClassLoader(s) around for some time (via Map<String, SoftReference<ClassLoader>>) using a "hash" of the environment variables as a key into the Map, the daemon would typically stabilize with a set of class loaders invoked from various client processes, each having a stable set of environment variables.

Regards, Peter


2017-05-14 19:51 GMT+02:00 Peter Levart <peter.lev...@gmail.com <mailto:peter.lev...@gmail.com>>:

    Hi Cedric,

    Chiming in with my thoughts...

    It's unfortunate that Gradle plugins or libraries used by plugins
    use environment variables at all. I wonder who was the first? Did
    Gradle introduce the feature of passing client environment to the
    daemon and establishing the view of System.getenv for plugins 1st
    or did libraries used by plugins happen to use environment
    variables using System.getenv and Gradle was just kind enough to
    provide them a dynamic view controlled by client?

    In the end it doesn't matter. The fact is that System.getenv is
    part of standard Java API and anyone using it should be aware that
    by doing so, they are limiting their software to be (re)configured
    only by spawning new process(es). UNIX environment was not
    designed to be mutated during the course of a long-running
    process. Maybe just at process startup/setup when conditions are
    under control (i.e. a single running thread) but later, the
    environment is meant to be read only.

    Maybe there is a solution for Gradle and othert though. But that
    solution, I think, is not in exposing a "live" view of process
    environment through System.getenv or new methods to "refresh" the
    view as you are proposing, since that would only encourage people
    to mutate the process environment which, as was established on
    this thread, is not safe in a long-running multi-threaded process,
    which Java processes usually are. Perhaps the solution is in
    extending the System.getenv API with ways to provide "custom"
    views of variables for code that runs in a "container".

    Gradle is, among other things, a container for plugins. Is Gradle
    running plugins in a separate ClassLoader? Does it use a separate
    ClassLoader for each "build" which might bring with it a new set
    of environment variables from the client? In such a setup, one
    could provide a separate set of environment variables for each
    ClassLoader, simply by passing them to the ClassLoader
    constructor. System.getenv would need to be a @CallerSensitive
    method which would return caller's ClassLoader view of variables,
    if any such view was established, or simply the view inherited
    from the parent ClassLoader.

    Such would be a "functional" approach, which would keep
    environment variables immutable, but allow child "contexts" to
    have different views of them. Such approach would also play well
    with idioms like:

    class AbcPlugin {
        static final String SOME_SETTING = System.getenv("SOME_SETTING");

    ...and would also help other containers (such as app servers)
    support existing libraries that use environment variables to be
    configured differently in different apps deployed in the same VM
    process.

    Just a thought.

    Regards, Peter



    On 05/11/2017 09:02 AM, Cédric Champeau wrote:
    equalThanks for the answers, folks, but I think they are kind of missing the
    point. The fact is that environment variables *are* mutable. Java happens
    to return an immutable view of the environment variables when the VM was
    started, which is not the reality. We cannot trust what `System.geteenv`
    returns. The Javadocs say "current system environment" but it's just not
    true. The method should be called `getEnvWhenTheVMWasStarted`.

    We also have the problem that the environment of the client and the daemon
    differ over time, as I explained in the previous email. And it is pretty
    easy to understand that _when the build runs_, we need to get the
    environment variables of the _client_, not the daemon. Imagine something as
    simple as this:

    String toolPath = System.getenv('SOMETOOL_HOME')

    and imagine that this code runs in the daemon. When the daemon is started,
    there's absolutely no guarantee that this variable is going to exist.
    However, we know that when we're going to execute the build, this variable
    *has* to be defined. Obviously, it's going to be done from the CLI:

    $ export SOMETOOL_HOME = ...
    $ ./gradlew run ---> connects to the daemon, passes environment variables,
    mutates those of the daemon, which then executes the build

    Similarly, from a *single* CLI/terminal, it's very common to change the
    values of environment variables (because, hey, they are mutable!):

    $ export SOMETOOL_HOME = another path I want to try out
    $ ./gradlew run --> ... must update env vars again

    So, to work around the problem that Java doesn't provide a way to mutate
    the current environment variables, we perform a JNI call to do it. *Then*
    we need to mutate the "immutable view" that Java provides, or all Java code
    is going to get wrong information, and we would never succeed. Note that
    having to resort on JNI to mutate the environment is not the problem. We
    can live with that. Once can think it's a bad idea to allow mutation, in
    practice, it is necessary, but I reckon it could be a bad idea to have an
    API for this. The *real* issue is that `System.getenv` does *not* return
    the real values, because it's an immutable view of what existed at the VM
    startup.

    Note that it's not recommended to mutate environment variables in a
    multi-threaded environment. But in practice, you can assimilate what we do
    to the VM startup: we get environment variables, mutate them, _then_ the
    build runs (and only at that moment we would effectively be
    multi-threaded). We can live with potential issues of mutating the
    environment: the benefits outperforms the drawbacks by orders of magnitude.
    When the daemon is activated, we're not talking about 10% faster builds.
    They can effectively be 3, 5 or 10x faster!

    Now, back to the problem with JDK 9:

    - first, the implementation has changed. But we could adapt our code.
    - second, calling `setAccessible` is not allowed anymore. And this is a
    MAJOR pita. The problem is that we're trying to access the java base
    module, and reflection on that module is not allowed anymore. There are no
    good solutions for this: we could write a JVM agent, like you suggested,
    that rewrites bytecode. But since we're trying to rewrite a core class, it
    would have to be found on the invocation of `java` command itself, and
    cannot be dynamically loaded. In addition, we would have to add a flag to
    open core classes to rewrite. There are multiple problems with this
    approach: first, we don't know on which environment we run before we're
    started. Gradle can run on JDK 7, 8, 9, ... and we cannot have a startup
    script which tries to infer the version from whatever it finds, this is not
    realistic and would add unacceptable latency on the client side (plus, a
    lot of headaches to support the various environments we support: Linux,
    Mac, Windows, ...). Second, we may not have a hand on the CLI at all. For
    example, if the build runs in Travis CI, Amazon, or whatever cloudish thing
    that spawns its own containers, we cannot tweak the VM arguments. We just
    have to use whatever is there. Last, rewriting bytecode has a cost, that
    you pay at startup.

    Again, what we need is that Java just returns the live view of the
    environment variables. Allowing *mutation* is not necessary, technically
    speaking, because we can rely on JNI. However, I reckon that returning an
    immutable view is done for performance reasons. That's why adding a way to
    mutate "the view" would be an acceptable thing I think. No reflection, no
    bytecode rewriting, just give a way to mutate the map that
    `ProcessEnvironment` retains. The advantage of this is that you would not
    effectively mutate the environment variables, so you, on the JVM side,
    would be safe. What you would mutate is the view you are retaining.
    Alternatively, provide us with an API to "refresh" the view. Like,
    `System.getenv(true)`. The benefit would be that you would only have to get
    the new view of environment variables if the user asks for it. And,
    subsequent callers would get the refreshed view, so people using `gettenv`
    in their code would be up-to-date.

    I'm really concerned that this problem is not taken seriously. It's a major
    performance problem for the most widely used build tool out there
    (considering all users, from native to Java and Android). Just saying
    "you're doing something nasty" is not a valid answer: we have to do it, and
    do it for good reasons.


    2017-05-11 4:50 GMT+02:00 Martin Buchholz<marti...@google.com> 
<mailto:marti...@google.com>:

    Since you're OK with doing brain surgery on the JDK and you control the
    entire process, consider controlling your daemon with a bytecode rewriting
    agent (e.g. byteman) that changes the definition of System.getenv etc.

    (Whose side are you on Martin?! ...  I confess I also wish for a faster
    gradle ...)




Reply via email to