[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White updated MAPREDUCE-1700:
---------------------------------
Attachment: MAPREDUCE-1700.patch
Here's a proof of concept for isolated classloaders in YARN. This approach uses
OSGi for isolation. The idea is that the task JVM uses a Felix container to
load the job JAR (which is an OSGi bundle) so that user code can use whichever
libraries it likes, even if they conflict with system JARs.
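To illustrate the underlying idea without an OSGi container, here is a minimal sketch of a child-first classloader in plain Java. This is not the mechanism the patch uses (the patch relies on Felix and OSGi bundles); it is a simpler stand-in technique. The class names and the isSystemClass rule are hypothetical and chosen only for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Child-first classloader sketch: classes found in the user JARs win over the
 * parent's copies, except for packages that must stay shared with the system.
 */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] userJars, ClassLoader parent) {
        super(userJars, parent);
    }

    /**
     * Hypothetical rule: only JDK classes are always taken from the parent.
     * A real task JVM would also have to share the framework API types
     * (for example the Mapper base class), otherwise casts between the
     * system and user copies of a type would fail.
     */
    static boolean isSystemClass(String name) {
        return name.startsWith("java.") || name.startsWith("javax.");
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    c = findClass(name); // look in the user JARs first
                } catch (ClassNotFoundException e) {
                    // not in the user JARs; fall through to the parent
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // normal parent-first delegation
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

class ChildFirstDemo {
    public static void main(String[] args) throws Exception {
        ClassLoader parent = ChildFirstDemo.class.getClassLoader();
        // No user JARs here, so every lookup ends up at the parent.
        try (ChildFirstClassLoader loader =
                 new ChildFirstClassLoader(new URL[0], parent)) {
            System.out.println(loader.loadClass("java.lang.String") == String.class);
        }
    }
}
```

With real user JAR URLs passed to the constructor, a library class present in both the user JARs and the system classpath would resolve to the user's copy for code loaded through this loader, while system code keeps its own copy.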
In this example I have created a fictitious library with two incompatible
versions. Version 1 is used by the system (in YarnChild) while version 2 is
used by the example Mapper. Without isolation the job fails with a
java.lang.NoSuchMethodError - regardless of whether the user JARs are first or
second on the classpath. When run using isolation, the job succeeds and we can
see that both version 1 and version 2 of the library are used:
{noformat}
/tmp/logs//application_1346151477167_0001/container_1346151477167_0001_01_000002/stdout:message 2
/tmp/logs//application_1346151477167_0001/container_1346151477167_0001_01_000002/syslog:2012-08-28 11:58:52,317 INFO [main] org.apache.hadoop.mapred.YarnChild: message 1
{noformat}
To run:
* Check out a revision of trunk that doesn't have MAPREDUCE-4068
('svn up -r 1376252')
* Apply the patch
* Run 'mvn versions:set -DnewVersion=3.0.0' to change the version numbers to
non-SNAPSHOT values, since OSGi doesn't like them.
* Build:
{noformat}
(cd hadoop-mapreduce-project/hadoop-mapreduce-examples/lib-v1; mvn install)
(cd hadoop-mapreduce-project/hadoop-mapreduce-examples/lib-v2; mvn install)
mvn clean install -DskipTests
(cd hadoop-mapreduce-project/hadoop-mapreduce-examples/class-isolation-example/; mvn install)
mvn package -Pdist -DskipTests -Dtar
{noformat}
* Install the tarball and run:
{noformat}
bin/hadoop fs -mkdir -p input
bin/hadoop fs -put /usr/share/dict/words input
bin/hadoop jar ~/.m2/repository/org/apache/hadoop/class-isolation-example/1.0-SNAPSHOT/class-isolation-example-1.0-SNAPSHOT.jar org.apache.hadoop.examples.classisolation.Driver input output
{noformat}
Still to do/future improvements:
* Make compatible with MAPREDUCE-4068.
* Write a unit test.
* Currently only the Mapper is loaded using an OSGi service; extend the
approach to all user-defined classes in an MR job.
* Use OSGi fragments so that user job JARs don't need a Registrar class, since
it would be a part of the host bundle that the job JAR extends.
* Write a utility to convert existing job JARs to OSGi bundles (or fragments).
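As a rough idea of what the conversion utility in the last item might start from, the sketch below builds the minimal manifest headers that mark a JAR as an OSGi bundle, using only java.util.jar. The class and method names are hypothetical. A real converter would also need to compute Import-Package and Export-Package headers, handle the Registrar (or fragment host) wiring, and rewrite the JAR itself; none of that is covered here.

```java
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper: builds the minimal OSGi headers for a job JAR. */
class BundleManifestSketch {

    /**
     * Returns a manifest carrying the basic bundle identity headers.
     * The version must already be in OSGi form, for example "3.0.0".
     */
    static Manifest bundleManifest(String symbolicName, String osgiVersion) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be present or the main section is not written out.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.putValue("Bundle-ManifestVersion", "2");
        attrs.putValue("Bundle-SymbolicName", symbolicName);
        attrs.putValue("Bundle-Version", osgiVersion);
        return manifest;
    }

    public static void main(String[] args) throws Exception {
        // Print the generated manifest so the headers can be inspected.
        bundleManifest("class-isolation-example", "3.0.0").write(System.out);
    }
}
```

The resulting Manifest could be passed to a java.util.jar.JarOutputStream when copying the entries of an existing job JAR into a new one.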
> User supplied dependencies may conflict with MapReduce system JARs
> ------------------------------------------------------------------
>
> Key: MAPREDUCE-1700
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: task
> Reporter: Tom White
> Attachments: MAPREDUCE-1700.patch
>
>
> If user code has a dependency on a version of a JAR that is different to the
> one that happens to be used by Hadoop, then it may not work correctly. This
> happened with user code using a different version of Avro, as reported
> [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852081#action_12852081].
> The problem is analogous to the one that application servers have with WAR
> loading. Using a specialized classloader in the Child JVM is probably the way
> to solve this.