Re: Disallowing the dynamic loading of agents by default

Christoph Engelbert Sun, 02 Apr 2017 23:30:07 -0700

Hi Mark, hi others,

First of all, I understand the idea behind this change and I think it certainly 
makes sense, but in my view the default is wrong, as Volker already 
pointed out.


Over the last few days I (with the help of others) put together a document 
(https://docs.google.com/document/d/19H1iGUnyI4Y40U5sNVAmH0uLxcpNfYSjPIW3s0Rv5Ss/edit?usp=sharing)
to see which tools are affected by this change. I want to apologize for the 
wording of the initial tweet: as Brian correctly pointed out on Twitter, 
asking “who thinks it’s a bad idea” doesn’t yield unbiased data. The idea, 
however, was not to hold a vote but to collect the affected tools, so the tweet 
was meant as an invitation to add content to the document. Anyhow, apologies if 
the tweet came across the wrong way.

Looking at APM, as Martijn mentioned, I don’t see a lot of impact, as most 
APMs should be added right from the start of the JVM. On the other hand, 
however, it seems that there are a lot of tools (probably more on the “devops” 
side of things) that are commonly added at runtime in case of a problem or 
error. Those tools would be greatly affected by the change and would require 
the new restriction to be routinely deactivated, which renders it kind of 
useless.

That said, it looks like the main group affected by this change is not 
developers, as you seem to have suggested in the initial mail, but operations. 
Furthermore, I’m not sure I agree with “if you have to tell customers to put 
additional flags on CL, one more is no problem” (as it sounded below). Normally 
you have to explain / fight over every single command line parameter that has 
to be set with the customer’s operations team (unless those parameters are GC 
configs ;-). That means it’ll be really hard to explain why they should 
deactivate something when doing so, as it will be put, undercuts the system’s 
security / integrity.

Most interestingly, as the document points out, there will be ways to 
circumvent the change, for example by creating a remote thread (on Windows) or 
using ptrace (on Linux). There are certainly ways on each operating system, but 
relying on them will make things even more insecure.

I would like dynamic loading to remain enabled by default, with a 
recommendation to deactivate it when none of those tools are required. That 
comes back to what Volker suggested: -XX:+DisableDynamicAgentLoading. 
Otherwise, from my point of view, most operations teams will have to activate 
dynamic loading anyway.
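
To make the operational cost concrete, here is a minimal sketch of what a 
launcher shared between JDK 8 and JDK 9 might have to do. It assumes the option 
is spelled -XX:+EnableDynamicAgentLoading, as in Andrew's mail below; the class 
and method names are hypothetical. The version check matters because, by 
default, a JVM refuses to start when given a -XX option it does not recognize.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class AgentLaunchFlags {

    // Assumed flag spelling, taken from the discussion in this thread.
    static final String ENABLE_FLAG = "-XX:+EnableDynamicAgentLoading";

    // Extracts the feature version from a java.version-style string:
    // "1.8.0_121" gives 8, while "9", "9.0.1" and "9-ea" all give 9.
    static int featureVersion(String javaVersion) {
        String[] parts = javaVersion.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            // Old "1.x" scheme: the feature version is the second component.
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // Returns the extra JVM arguments a launcher would have to pass.
    // The flag is added only on JDK 9 or later, and only when tools may
    // need to be attached at runtime.
    static List<String> extraJvmArgs(String javaVersion, boolean needsDynamicAgents) {
        List<String> args = new ArrayList<>();
        if (needsDynamicAgents && featureVersion(javaVersion) >= 9) {
            args.add(ENABLE_FLAG);
        }
        return args;
    }
}
```

With this sketch, extraJvmArgs("1.8.0_121", true) yields an empty list, while 
extraJvmArgs("9", true) yields the single flag. Every launcher that may need 
runtime-attached tooling would have to carry logic of this kind.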

As a final but short disclaimer: Hazelcast is not directly affected by this 
change but we see customers using debugging, profiling and tracing / reporting 
tools (same as APMs) that are added at runtime and they’re often required to 
create meaningful error reports for us. That’s why I care about this change, 
and I guess a lot of people think the same way.



Christoph Engelbert
Manager Developer Relations 

> On 3. Apr 2017, at 01:39, Stephen Felts <stephen.fe...@oracle.com> wrote:
> 
> I agree with Andrew's position that if the argument is added in JDK9, it 
> should default to allow dynamic loading of agents.
> 
> Arguing from the position "Isn't it already the case, however, that migrating 
> existing applications to JDK 9 is often going to require the use of a few new 
> options anyway, in order to expose internal APIs" isn't a valid argument IMO. 
>  Although migration to JDK 9 will be painful, I think that we will get to 
> zero JDK 9 command line arguments.  As proposed, this new argument will never 
> go away.
> 
> It's highly likely that customers will have scripts that they migrate from 
> JDK 8 to JDK 9.  We don't control that.
> And many developers don't use any scripts because for many cases, they don't 
> care about the garbage collector or memory or whatever the scripts provide.
> But they do care about product functionality provided by an agent.
> 
> 
> 
> -----Original Message-----
> From: Andrew Dinn [mailto:ad...@redhat.com] 
> Sent: Friday, March 31, 2017 5:46 AM
> To: Mark Reinhold
> Cc: jigsaw-dev@openjdk.java.net
> Subject: Re: Disallowing the dynamic loading of agents by default
> 
> Hi Mark,
> 
> On 30/03/17 16:38, mark.reinh...@oracle.com wrote:
>> // Moving the general discussion to jigsaw-dev for the record;
>> // bcc'ing {hotspot-runtime,serviceability}-dev for reference.
>> 
>> Andrew,
>> 
>> Thanks for your feedback on this topic [1][2][3].
> 
> ... and thank you for your considered reply.
> 
>> First, we apologize for the way in which this topic was raised.  Our 
>> intent was to post a proposal for discussion prior to code review, but 
>> unfortunately that review was posted prematurely (as is evident by its 
>> inclusion of Oracle-internal e-mail addresses and URLs).
> 
> Hmm, yes! I must say I didn't notice that. I appreciate the apology but it's 
> not really necessary. I certainly didn't expect any explanation to omit some 
> element of miscommunication and/or cock-up :-)
> 
>> Second, I agree with your earlier analysis as to the security impact 
>> of this change.  If an attack is possible via this vector then closing 
>> the vector would only slow the attack, not prevent it.
> 
> Good, I am glad to hear there is not some terrible loop-hole at play that I 
> am not aware of.
> 
>> The motivation for this change is, however, not merely to improve the 
>> security of the platform but to improve its integrity, which is one of 
>> the principal goals of the entire modularity effort.  ...
> 
> Ok, I understand the motive here although I'm still not personally convinced 
> by it. I'll come to the practical considerations below. Before that I'd like 
> to address the question of integrity at a more abstract level.
> 
> I'm certainly not against providing -XX:+/-EnableDynamicAgentLoading as a 
> command line option. I agree that it's probably useful for some users to have 
> the option to completely lock down the platform to guarantee its integrity. 
> It seems from what you say above that this lock-down option is only there to 
> provide 'belt and braces'. In other words, it is only necessary to guard 
> against a security breach that could be managed by other means (e.g. a 
> failure to control what jars go into your classpath; a failure to control 
> access to the JVM uid on the JVM host machine).
> I cannot fault the idea of a belt and braces lockdown per se but I am still 
> not convinced why that extra protection needs to be enabled /by default/.
> 
> You specifically bring up the scenario where rogue code, once entered into 
> the JVM, might use the attach API to raise its privilege level.
> 
> "As things stand today, code in any JAR file can use the 
> `VirtualMachine.loadAgent` API to load an agent into the JVM in which it's 
> running and, via that agent, break into any module it likes."
> 
> Yet, you also acknowledge above that this merely constitutes an opportunistic 
> escalation of a situation that is already a serious security breach in its 
> own right. I don't think I follow the logic here.
> 
> Are you saying that we need the extra braces because there is a real danger 
> here, one that users cannot rightly always be expected to guard against? Or 
> are you just being extra cautious? This is really the crux of the matter 
> because that extra caution has to be weighed against the extra cost of lost 
> opportunities to deploy agents in abnormal situations.
> 
> n.b. I know in the case of Red Hat's middleware that this is a real cost 
> which will definitely arise no matter how hard we work to educate users about 
> the necessary advance preparation required. It is also a significant cost 
> because it will damage our ability to resolve certain very difficult support 
> issues where only an agent can provide the information needed. And that is 
> above and beyond the cost of the re-education task itself. I don't 
> doubt other companies will be affected similarly.
> 
> My mention above of 'abnormal' situations underlines why your argument about 
> integrity is somewhat moot (to me). Yes, it is important to know that 
> encapsulation means encapsulation -- at least, I agree that is so in /normal/ 
> circumstances. However, agents are clearly not normal code performing the 
> normal program operations of an application. Many agents are specifically 
> designed for deployment in abnormal situations and perform abnormal actions. 
> That is precisely what provides the impetus to deploy agents dynamically.
> 
> It is highly valuable in such circumstances, and only in those circumstances, 
> to be able to allow privileged agent code to /selectively/ remove certain 
> integrity barriers, even if -- perhaps, especially because -- any dismantling 
> of the normal rules of operation only happens modulo the specific licence the 
> agent has been crafted and configured to grant. Useful agents clearly scope 
> the degree to which they perturb normality to achieve abnormal results. 
> Careful and thoughtful users can (must) still feel safe that an agent is not 
> going to do catastrophic damage to the running application and the integrity 
> of its data and operation. Ironically, this means that deployment of my agent 
> is actually a relatively normal (even if infrequent) procedure for many of 
> our users.
> 
> So, while I agree that platform (or even application) integrity is a valuable 
> property to maintain in normal program operation, I don't think those 
> concerns are warranted in the case of an agent that has been deliberately and 
> carefully deployed by those in charge of an application. I suspect we are 
> probably not going to agree about the proposed default on these grounds (and 
> I also suspect I will not be the only one to disagree with your position). 
> So, perhaps we would be better off moving on to pragmatic concerns.
> 
>> I understand your points about the practical difficulties of having to 
>> educate users about this new option and enhance startup scripts to use 
>> the option only when invoking JDK 9.  Isn't it already the case, 
>> however, that migrating existing applications to JDK 9 is often going 
>> to require the use of a few new options anyway, in order to expose internal 
>> APIs?
>> If so then would it really be that much more burdensome for users also 
>> to think explicitly, at the same time, about whether they want to 
>> enable dynamic agent loading?
> 
> If the default is reset to allow dynamic loading then I am happy to fully 
> endorse this change and see no significant consequences. If this change is 
> going to happen with your proposed default then I would very much prefer it 
> to be staged: introduce the flag in 9 but with the default being to allow 
> dynamic loading of agents (i.e. default to the status quo); reset the default 
> in 10 to disable loading. The benefits of that are:
> 
>  - aware JDK9 users can still use or ignore the option as they see fit
> 
>  - unaware JDK9 users will not get hit by the change by surprise in JDK9
> 
>  - unaware JDK10 users may still get hit by surprise but by that stage any 
>    configuration option they add to their JDK10 scripts will be compatible if 
>    they need to switch back and forth between JDK10 and JDK9
> 
>  - implementers of agents and implementers of middleware that might benefit 
>    from using those agents have more time to prepare their users, limiting 
>    the potential for any such nasty surprise in JDK10
> 
>> This change would be disruptive to some but it's the best way we've 
>> found, so far, to preserve platform integrity in the face of dynamic 
>> agent loading.  If there's a better way to do that, we'd like to know.
> 
> No, I don't think there is a better mechanism, only a better default.
> That reflects my belief that, while 'preserving platform integrity' is a 
> highly desirable goal, for most users it does not merit being pursued 'in the 
> face of dynamic agent loading'.
> 
> regards,
> 
> 
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
