2017/4/3 9:44:43 -0700, Andrew Dinn <ad...@redhat.com>:
> On 03/04/17 16:50, Alan Bateman wrote:
>> ...
>>
>> Java SE 9 / JDK 9 brings strong encapsulation. The access control for
>> the Java Language and VM has been extended to modules so that modules
>> that don't want their internals to be accessed from code outside the
>> module can do so. None of the core modules want their internals to be
>> accessed so none of the core modules are open or open any packages. A
>> consequence of this is that code on the class path or module path
>> doesn't get to break in these modules. This is really nice but it
>> exposes a lot of technical debt in existing code (as we've seen in mails
>> here over the last 18 months).
>
> Yes, it is really nice in many ways. But it is also not necessarily what
> everyone wants.
>
> One thing I am not clear on as regards that 'really nice' is whether
> anything in the JVM wants -- or even hopes -- to rely on module
> encapsulation never being broken in order to provide 'semantic'
> guarantees. That's different from relying on it to provide security
> guarantees. The sort of thing I am thinking about is, say, a module-wide
> global analysis in the JIT guaranteeing that a call argument will only
> ever be non-NULL, a positive int, or some such invariant that can be fed
> into an optimization phase. I can understand how a switch to disable
> dynamic agent loading might be needed to underline that sort of guarantee.

That's exactly the kind of thing we want to enable, long-term, and one
reason why integrity is worth improving aside from any considerations of
security -- and why suggestions by others to "just use a security
manager" if you care about such things are beside the point.

(An even simpler example than the ones you mention is the fact that
`final` doesn't mean "final" in the face of deep reflection. This either
prevents or greatly complicates optimizations that are based on or
enabled by constant folding, which is an awful lot of them.)

In addition to future performance improvements, let's not forget about
maintainability. Improved integrity allows those of us who maintain the
JDK to change internal implementation details without having to worry
about breaking user code. It allows users of the JDK to be confident
that their code depends only on supported interfaces, so that they can
more easily upgrade from one release to the next. We've all too often,
over the last twenty years, had to back out internal changes because
some popular library depended upon those internals.

A common response to this concern is to say, "then don't use libraries
that depend upon JDK internals!" Would that it were that simple. The
hardest cases arise when some library A that depends on JDK internals is
published, then it's used by library B, which in turn is used by library
C, and that in turn is used by the immensely-popular library D, whose
authors don't even know that library A depends on JDK internals. (They
might not even know that D ultimately depends on A!) Then we fix a bug
in the JDK that changes those internals, library D breaks, a large
number of popular applications (or applications with important
customers) stop working, and the cries and support escalations demanding
that the change be reverted ensue.

Anyway ... at the highest level, improving platform integrity is not
just about improving security.
It's also about improving performance, in the long term, and about
improving the maintainability of everybody's code, even in the short
term.

>> . . .
>> Now bring the attach API and late binding agents into the picture. This
>> is where things blur and where the problem arises. A library can use the
>> attach mechanism to load an agent into the current VM and break into any
>> module. It's much easier in JDK 9 compared to previous releases because
>> the jdk.attach module is resolved by default. All it takes is someone to
>> post a solution on stackoverflow that spins a sneaky agent to leak the
>> Instrumentation object to the library. It's just too easy to "migrate"
>> existing reflection hacks.
>
> I think this is already very well known technology and the presence of
> the jdk.attach module in the runtime was never really much of a bar. The
> people who want to do this certainly don't have to look for sneaky
> solutions on stackoverflow. However, I still don't really see your point
> here. If people want to migrate existing reflection hacks then they can
> and will do so by switching off your proposed flag or adding the agent
> at startup.

Sure, but both of those approaches require access to the command line,
and we already have no choice but to trust those who have such access,
so there's no additional risk (or sneakiness) there.

> Are you perhaps concerned that users might have their hand forced by
> providers of library jars or middleware who hoist an agent into the JVM
> behind their backs? I think that would be rather a patronising view to
> take of the vast majority of producers and consumers of libraries and/or
> middleware. My belief is that anyone who attempted to provide a library
> or framework (open source or not) that disabled (some or all) modules by
> stealth would very soon be found out. I also believe they would
> immediately lose all (or, at least, most) of their prospective business
> from serious, paying customers.
> After all, that sort of behaviour
> *would* be a major security issue.

I wish I could share your optimism in this regard, but it's contrary to
my experience. That a library does questionable things does not always
dissuade people from using it.

An instance of the scenario I described above, where you can read
"popular applications" as "every major Java EE application server",
occurred late in the development of JDK 5. Some obscure library was
reaching into JDK internals. The fact that it did so was well-documented
and (presumably) clear to its immediate users but did not prevent it
from becoming widely used. It was apparent at the time that the true
nature of this library surprised some of those responsible for
higher-level components that depended indirectly upon it.

So, yes, call it patronising if you like but we are concerned about
enabling a library to self-attach and inject an agent which it uses to
subvert strong encapsulation without the developer (or the deployer)
knowing about it.

I fully agree that sophisticated late-binding agents can have legitimate
needs to break strong encapsulation, and that they should be permitted
to do so in the full sight of their users. What I don't yet see is a way
to enable those by default while at the same time disabling sneaky
encapsulation-busting libraries whose inevitable inadvertent use will
lead to maintenance headaches down the line.

> As I mentioned in my reply to Mark it is critical for users to know
> exactly how any agent they load into their JVM is going to modify the
> access restrictions in place in the JVM so that they can be sure that use
> of the agent is safe (just as they need to know that any jars they place
> into the classpath are not going to do things like round down their
> costs and pocket the spare change in a bank account somewhere). Users
> don't just add stuff to their classpath without knowing what it does and
> why it is appropriate.

Again, I wish this were true, but it's not.
Developers all too often just add stuff to the class path until things
work or, more commonly, ask build tools to do that for them via
dependency management. They ship the result and then, over time, others
come to depend upon it.

If developers and deployers have to opt in to breaking encapsulation on
the command line then that at least makes it clear to someone trying to
diagnose a failing system that something fishy might be going on.

>> The attach mechanism was of course never intended to be used this way.
>> It was meant for troubleshooting tools and profilers/similar to load
>> agents into running VMs. Back in the JDK 6 then we did consider
>> disallowing attaching to the current VM but didn't enforce it - one
>> reason is that it's not hard to just fork a VM with tools.jar on the
>> class path and connect back to the parent.
>
> Well, I understand that this is not what you intended. However, i) it
> turns out to have been very useful that it does work this way and ii)
> stopping it doing so has a cost which needs to be taken into account --
> at the very least by giving those who have been relying on it for quite
> some time to manage their business concerns to adjust to the change.

No argument there.

>> So that is the context for the discussion. We need to find a good way to
>> put the Genie back in its bottle. It may be that we have to disable
>> attaching to the current or ancestor VMs. We may have to prohibit the
>> instrumentation of core modules by late binding agents. We may have to
>> do some disabling of agent loading. Maybe a combination. Suggestions and
>> proposals are of course welcome.
>
> I'm very happy to consider all sorts of half-way houses or even -- in
> time -- the full change recommended in the original JIRA.
> For example,
> rejecting instrumentation in some specific set of bootstrap/JDK module
> classes (like, say, java.base) from an agent which has been dynamically
> loaded might well be a workable compromise -- that at least allows users
> to employ an agent to tweak any code that is in the classpath through
> their choice.

That's an interesting option, and worth exploring further. It could,
though, be troublesome for legitimate late-binding agents that
instrument JDK internals. Is Byteman typically used in that way?

- Mark
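
P.S. For anyone who hasn't seen the `final` problem first-hand, here is
a minimal sketch of it. It is not taken from any real library: the names
`FinalFieldDemo`, `Point` and `overwriteFinal` are made up for
illustration. It assumes a JDK on which deep reflection is still allowed
to write to final instance fields of ordinary classes, which has been
the case in releases to date but need not remain so.

```java
import java.lang.reflect.Field;

// Hypothetical demo class; all names here are illustrative only.
final class FinalFieldDemo {

    // An ordinary class with a final instance field. The value comes from a
    // constructor argument, so the compiler cannot inline it as a constant.
    static final class Point {
        private final int x;
        Point(int x) { this.x = x; }
        int x() { return x; }
    }

    // Uses deep reflection to overwrite the final field, then reads it back
    // through the normal accessor.
    static int overwriteFinal(Point p, int newValue) {
        try {
            Field f = Point.class.getDeclaredField("x");
            f.setAccessible(true);   // suppress access checks, including final-ness
            f.setInt(p, newValue);   // write to a field declared final
            return p.x();
        } catch (ReflectiveOperationException e) {
            // Assumption: a runtime that refuses the write ends up here.
            throw new IllegalStateException("final field could not be modified", e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1);
        // Under the assumption stated above, this prints 42 rather than 1.
        System.out.println(overwriteFinal(p, 42));
    }
}
```

As long as a write like this can succeed, a compiler has to be very
careful about treating `x` as a constant once a `Point` has been
constructed, which is the optimization problem described above.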