[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742855#comment-15742855
 ] 

Sangjin Lee commented on HADOOP-13070:
--------------------------------------

I’ve been testing the latest patch (HADOOP-13998) and there seems to be one 
interesting issue with the stricter classpath isolation, and that has to do 
with the {{ServiceLoader}}.

(1) {{ServiceLoader}}
The {{ServiceLoader}} essentially uses the following pattern to load service 
classes dynamically:
{code}
Enumeration<URL> defs = 
classloader.getResources(“META-INF/services/org.foo.ServiceInterface”);
while (defs.hasMoreElements()) {
  URL def = defs.nextElement();
  Iterator<String> names = parse(def);
  for (String name : names) {
    ServiceInterface si = (ServiceInterface)Class.forName(name, classloader);
  }
}
{code}
First off, {{ClassLoader.getResources()}} will return all service files, 
*regardless of* where the service file is, either in the user classpath or the 
parent classpath (bit more discussion on {{ClassLoader.getResources()}} below).

Since all service files have been located and the calling class of 
{{Class.forName()}} is {{ServiceLoader}} which is a system class, all service 
classes will be successfully loaded, *regardless of* where the service class 
is, either in the user classpath or the parent classpath.

Technically this would represent an opportunity to circumvent the isolation and 
load stuff from the parent classpath. That said, we could still regard this as 
a variation of a “system facility providing a way to load things from both 
classpaths” case mentioned in the proposal (section 2-1).

I thought about plugging this possibility, but there doesn’t seem to be an 
unambiguous way to do this.

One approach I considered is to go above the call stack to identify who’s 
calling {{ServiceLoader.load()}}. Suppose we use the calling class to enforce 
stricter loading. If a user class is a calling class, it would load service 
files from both the user classpath and the parent classpath. However, as it 
iterates over the classes, it will fail to load a non-system parent class. This 
causes a *hard* failure on the iteration on {{ServiceLoader}}.

On the other hand, we could try to determine somehow a certain service file is 
a “non-system parent service file” and not return that service file resource 
with {{ClassLoader.getResources()}} to begin with. However, the notion of a 
“non-system parent service file” is not well defined, and I don’t think there 
is a way to define this clearly.

I think the best way forward is to allow {{ServiceLoader}} to load services 
from both the user and the parent classpath. I’d love to hear your thoughts on 
this.

(2) {{ClassLoader.getResources()}}
Currently {{ApplicationClassLoader}} does not override this. The javadoc for 
{{ClassLoader.getResources()}} states:
{noformat}
…
The search order is described in the documentation for getResource(String).
{noformat}
Since we do not override this today, we return the resources from the parent 
first and then from the child, which is not quite the same as what the javadoc 
indicates. So it seems to me that at minimum we want to change the order of 
resources so that it returns the child resources first.

The next question is whether it should return a (non-system) parent resource if 
a user class calls this method. We could tighten this to filter out non-system 
parent resource. I am leaning towards making that change.

Thoughts? Feedback? Concerns?
cc [~busbey]


> classloading isolation improvements for stricter dependencies
> -------------------------------------------------------------
>
>                 Key: HADOOP-13070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13070
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>         Attachments: HADOOP-13070.poc.01.patch, Test.java, TestDriver.java, 
> classloading-improvements-ideas-v.3.pdf, classloading-improvements-ideas.pdf, 
> classloading-improvements-ideas.v.2.pdf, lib.jar
>
>
> Related to HADOOP-11656, we would like to make a number of improvements in 
> terms of classloading isolation so that user-code can run safely without 
> worrying about dependency collisions with the Hadoop dependencies.
> By the same token, it should raised the quality of the user code and its 
> specified classpath so that users get clear signals if they specify incorrect 
> classpaths.
> This will contain a proposal that will include several improvements some of 
> which may not be backward compatible. As such, it should be targeted to the 
> next major revision of Hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to