[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106716#comment-14106716 ]

Hudson commented on HADOOP-10893:

FAILURE: Integrated in Hadoop-Yarn-trunk #654 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/654/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.cmd
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop.cmd
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ApplicationClassLoader.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheck.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheckMain.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheckSecond.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheckThird.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestApplicationClassLoader.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestRunJar.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ApplicationClassLoader.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApplicationClassLoader.java

isolated classloader on the client side
---

Key: HADOOP-10893
URL: https://issues.apache.org/jira/browse/HADOOP-10893
Project: Hadoop Common
Issue Type: New Feature
Components: util
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Fix For: 3.0.0, 2.6.0
Attachments: HADOOP-10893-branch-2.patch, HADOOP-10893-branch-2.patch, HADOOP-10893.patch, HADOOP-10893.patch, HADOOP-10893.patch, HADOOP-10893.patch, HADOOP-10893.patch, HADOOP-10893.patch, HADOOP-10893.patch, classloader-test.tar.gz

We have the job classloader on the mapreduce tasks that run on the cluster. It has the benefit of isolating the class space for user code and avoiding version clashes. Although they occur less often, version clashes also occur on the client JVM. It would be good to introduce an isolated classloader on the client side as well to address this. A natural point to introduce this may be through RunJar, as that is how most Hadoop jobs are run.
-- This message was sent by Atlassian JIRA (v6.2#6252)
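
The isolation the issue description asks for can be pictured as a child-first classloader: classes are looked up in the application's own jars before the parent loader is consulted, except for a set of "system" classes that must always come from the parent. The sketch below illustrates only that general technique. `ChildFirstClassLoader`, its constructor, and the prefix-matching rule are hypothetical names chosen for illustration; they are not taken from the HADOOP-10893 patch or from Hadoop's ApplicationClassLoader.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration of child-first class loading; not Hadoop's implementation.
class ChildFirstClassLoader extends URLClassLoader {
    // Name prefixes that are always delegated to the parent (assumed matching rule).
    private final String[] systemPrefixes;

    ChildFirstClassLoader(URL[] urls, ClassLoader parent, String... systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes;
    }

    // True when the class name starts with one of the configured system prefixes.
    boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Child first: search this loader's own URLs before asking the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the application jars; fall through to normal delegation.
                }
            }
            if (c == null) {
                // System classes, and anything the child could not find, use parent-first lookup.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With prefixes such as "java." and "javax." configured, a request for java.lang.String is served by the parent, while a user library bundled in the application jar would shadow a different version of the same library on the parent's classpath.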
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106859#comment-14106859 ]

Hudson commented on HADOOP-10893:

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1845 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1845/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106931#comment-14106931 ]

Hudson commented on HADOOP-10893:

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1871 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1871/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105697#comment-14105697 ]

Jason Lowe commented on HADOOP-10893:

Thanks for updating the branch-2 patch, Sangjin. I noticed that in that patch hadoop.cmd doesn't have the same comments as were added for the new env variables in the trunk patch. Could you update the branch-2 patch to add those as well for consistency?

Similarly, there were comments and example exports added to hadoop-env.sh in the trunk patch that are missing from the branch-2 patch. I'm not sure whether the comments in hadoop-config.sh should be moved/copied to hadoop-env.sh and sample exports added, or whether we want them left in hadoop-config.sh for branch-2.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105724#comment-14105724 ]

Sangjin Lee commented on HADOOP-10893:

The comments in hadoop-env.sh in the trunk patch are found in hadoop-config.sh in the branch-2 patch. It appears that the existing pre-HADOOP-9902 comments in hadoop-config.sh were replaced by (shorter) comments in hadoop-env.sh. hadoop-env.sh did not have comments on things like HADOOP_USER_CLASSPATH_FIRST prior to HADOOP-9902. Following that move, I basically moved the new comments from hadoop-config.sh to hadoop-env.sh. If we were to move those comments to hadoop-env.sh in branch-2, then I think we'd need to move the other existing comments similarly, or the new comments would look out of place. Thoughts?

Also, prior to HADOOP-9902, hadoop-config.cmd did not have the mirroring comments for the environment variables. That felt a bit awkward, but again I felt that we should either recreate the same comments for all environment variables or for none. Now that this has been corrected with HADOOP-9902, I decided to add the same comments to hadoop.cmd in trunk, hence the discrepancy between branch-2 and trunk.

Let me know what you think about these two... Thanks!
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105819#comment-14105819 ]

Allen Wittenauer commented on HADOOP-10893:

tl;dr: [~sjlee0]'s changes here are probably the correct ones.

From a pure patch perspective, it does look weird. But from a stylistic perspective as a part of a total work (namely, hadoop 2.x), the changes and lack of documentation in hadoop-env.sh, etc, to branch-2 make a lot of sense.

One of the key points of HADOOP-9902 was to highlight to end users what things they could set. Hiding that in hadoop-config.sh, which users are never directed to documentation-wise, didn't really work. So I pulled those out and popped them into hadoop-env.sh, which users definitely see. This change just got caught in the crossfire.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105830#comment-14105830 ]

Jason Lowe commented on HADOOP-10893:

Sounds good, thanks for the clarification Sangjin and Allen! +1 for the latest patches, will commit this in a bit.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105858#comment-14105858 ]

Sangjin Lee commented on HADOOP-10893:

Thanks both!
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106046#comment-14106046 ]

Sangjin Lee commented on HADOOP-10893:

Thanks!
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106195#comment-14106195 ]

Hudson commented on HADOOP-10893:

FAILURE: Integrated in Hadoop-trunk-Commit #6093 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6093/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103532#comment-14103532 ]

Hadoop QA commented on HADOOP-10893:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12662951/HADOOP-10893.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

                  org.apache.hadoop.ha.TestActiveStandbyElector

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4511//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4511//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103886#comment-14103886 ]

Allen Wittenauer commented on HADOOP-10893:

Yeah, the export line examples make a huge difference. Thanks.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103995#comment-14103995 ]

Jason Lowe commented on HADOOP-10893:

Latest trunk patch looks good. Sangjin, could you update the branch-2 patch accordingly? After that I think this is ready to commit.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104414#comment-14104414 ]

Hadoop QA commented on HADOOP-10893:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12663145/HADOOP-10893-branch-2.patch
  against trunk revision .

    {color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4519//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102227#comment-14102227 ]

Kihwal Lee commented on HADOOP-10893:

+1 the patch looks good. [~jlowe], do you have any further comments?
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102842#comment-14102842 ]

Jason Lowe commented on HADOOP-10893:

The patch no longer applies after HADOOP-9902, and since that only went into trunk we'll need a separate patch for branch-2.

I tried to kick the tires on the latest patch but the client classloader never activated even though I set HADOOP_USE_CLIENT_CLASSLOADER=true. That's because the following code will always return false:

{code}
boolean useClientClassLoader() {
  return Boolean.getBoolean(System.getenv(HADOOP_USE_CLIENT_CLASSLOADER));
}
{code}

getBoolean looks up the value of the system property with the given name, whereas parseBoolean parses the given string itself as a boolean. Here the value of the environment variable is being used as a property name, so the lookup finds nothing.

Other comments are minor or nits:

In hadoop-config.sh, "The system classes are A comma-separated list" s/b "The system classes are a comma-separated list".

It would be nice if TestMain, TestSecond, TestThird were a bit less generically named since they are for a very specific test, e.g.: ClassLoaderCheckAppMain, ClassLoaderCheckAppSecond, ClassLoaderCheckAppThird, etc. Not a must-fix though, rather thinking people may wonder what the names mean when they run across them in the source since TestMain sounds pretty generic.
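The getBoolean/parseBoolean distinction raised in the review comment above can be checked in isolation. What follows is a minimal sketch, not the code from the patch: the class and method names (EnvFlagCheck, viaGetBoolean, viaParseBoolean) are made up for illustration, and the environment value is passed in as a parameter so the result does not depend on the caller's environment.

```java
// Hypothetical helper class, for illustration only; not part of the patch.
final class EnvFlagCheck {

  // Mirrors the pattern flagged in the review: Boolean.getBoolean(s) treats s
  // as the NAME of a system property and returns true only if that property
  // is set to "true". Passing the env value "true" therefore looks up a
  // system property that is literally called "true".
  static boolean viaGetBoolean(String envValue) {
    return Boolean.getBoolean(envValue);
  }

  // Parses the string itself: true only for "true" (case-insensitive),
  // false for anything else, including null.
  static boolean viaParseBoolean(String envValue) {
    return Boolean.parseBoolean(envValue);
  }

  public static void main(String[] args) {
    String value = System.getenv("HADOOP_USE_CLIENT_CLASSLOADER");
    System.out.println("getBoolean:   " + viaGetBoolean(value));
    System.out.println("parseBoolean: " + viaParseBoolean(value));
  }
}
```

Unless a system property named "true" happens to be defined in the JVM, the first method returns false for the input "true", which matches the behaviour reported in the review.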
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102898#comment-14102898 ]

Sangjin Lee commented on HADOOP-10893:

Thanks Jason. I'll address those issues, and upload a new patch (and add a separate patch for branch-2). It was an oversight on my part.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103194#comment-14103194 ]

Allen Wittenauer commented on HADOOP-10893:

{code}
+# If HADOOP_USE_CLIENT_CLASSLOADER is set, user classes and their dependencies
+# as defined by HADOOP_CLASSPATH and the jar as the hadoop jar argument are
+# loaded by a separate classloader. It should not be mixed with
+# HADOOP_USER_CLASSPATH_FIRST. If it is set, HADOOP_USER_CLASSPATH_FIRST is
+# ignored. Can be defined by doing
+# export HADOOP_USE_CLIENT_CLASSLOADER=true
+
+# HADOOP_CLIENT_CLASSLOADER_SYSTEM_CLASSES overrides the default definition of
+# system classes for the client classloader. The system classes are a
+# comma-separated list of classes that should be loaded from the system
+# classpath, not the user-supplied JARs, when HADOOP_USE_CLIENT_CLASSLOADER is
+# enabled. Names ending in '.' (period) are treated as package names, and names
+# starting with a '-' are treated as negative matches.
+
{code}

I'm not a fan of this wall of text sitting in hadoop-env.sh. Ideally, this should really be in documentation with a very light description here; that second paragraph seems too much.

Additionally, burying the variable in the middle of the description is confusing. It should be the last thing in the section so that it is clear that's what one needs to change. In other words, follow the pattern established elsewhere.

The change to hadoop_add_to_classpath_userpath looks fine, based upon my understanding of what this patch is doing.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103196#comment-14103196 ]

Allen Wittenauer commented on HADOOP-10893:

OK, I see the mistake I made. There is no example export line for HADOOP_CLIENT_CLASSLOADER_SYSTEM_CLASSES so I thought it was still describing the first one. So yeah, add that instead. ;)
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103212#comment-14103212 ]

Sangjin Lee commented on HADOOP-10893:

Thanks for the review Allen. I didn't find a suitable home for the description, and was following the convention prior to the change. Now that the pattern has changed, let me see if I can be more concise here. I'll also add an example export line for the latter variable.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100729#comment-14100729 ]

Kihwal Lee commented on HADOOP-10893:

The latest patch looks good. One thing I am not sure about is leaving the conf in mapred-default.xml. Since the default system classes are in the code, the entry in mapred-default.xml kind of serves as documentation, which needs to be kept in sync with the code. From a pure functionality point of view, we can simply remove it, but then do those who want to modify the list need to look at the code?

Maybe we can have the MR logs show the list of system classes when the app class loader is activated. Then it might be easier for users to figure out the default/current list and modify it. Any other thoughts?
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100931#comment-14100931 ]

Sangjin Lee commented on HADOOP-10893:

Thanks for the comment, Kihwal. I agree that if the default is now directly in the source, having to redefine the same default in mapred-default.xml is less than optimal. I like the idea of printing out the system classes when the application classloader is instantiated.

Having said that, how about the definition of the environment variable on the client classloader usage side (which was added in the latest patch)? To be symmetric, I think it should be removed again as well. Thoughts?
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100957#comment-14100957 ]

Sangjin Lee commented on HADOOP-10893:

On the other hand, mapred-default.xml had a good description of the format of the system classes value:

{panel}
A comma-separated list of classes that should be loaded from the system classpath, not the user-supplied JARs, when mapreduce.job.classloader is enabled. Names ending in '.' (period) are treated as package names, and names starting with a '-' are treated as negative matches.
{panel}

We could move that to the javadoc of ApplicationClassLoader, but that's a little less than satisfying, as users (not developers) are the ones who need to override this value.
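The format quoted in the comment above (comma-separated entries, a trailing '.' for packages, a leading '-' for negative matches) can be illustrated with a small matcher. This is a simplified sketch and not the logic in ApplicationClassLoader: the class name SystemClassMatcher is hypothetical, the example list is invented, and two behaviours are assumptions made here, namely that a negative match wins regardless of its position in the list and that nested classes get no special treatment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified matcher for a "system classes" list; an
// illustration of the documented format, not Hadoop's implementation.
final class SystemClassMatcher {
  private final List<String> includes = new ArrayList<>();
  private final List<String> excludes = new ArrayList<>();

  SystemClassMatcher(String commaSeparatedSpec) {
    for (String raw : commaSeparatedSpec.split(",")) {
      String entry = raw.trim();
      if (entry.isEmpty()) {
        continue;                       // tolerate stray commas
      }
      if (entry.startsWith("-")) {
        String negated = entry.substring(1);
        if (!negated.isEmpty()) {
          excludes.add(negated);        // leading '-' marks a negative match
        }
      } else {
        includes.add(entry);
      }
    }
  }

  // An entry ending in '.' is a package prefix; anything else is an exact class name.
  private static boolean matches(String pattern, String className) {
    return pattern.endsWith(".")
        ? className.startsWith(pattern)
        : className.equals(pattern);
  }

  // Assumption in this sketch: any negative match overrides positive ones.
  boolean isSystemClass(String className) {
    for (String pattern : excludes) {
      if (matches(pattern, className)) {
        return false;
      }
    }
    for (String pattern : includes) {
      if (matches(pattern, className)) {
        return true;
      }
    }
    return false;
  }
}
```

With the invented list "java., -com.example.app.Overridden, com.example.", this sketch treats java.lang.String and com.example.util.Helper as system classes, while com.example.app.Overridden and org.other.Thing would be left to the user-supplied JARs.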
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100979#comment-14100979 ]

Sangjin Lee commented on HADOOP-10893:

I suppose we can remove the (redundant) value but keep the description. I'll post an updated patch shortly.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101796#comment-14101796 ]

Hadoop QA commented on HADOOP-10893:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12662565/HADOOP-10893.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

org.apache.hadoop.ipc.TestDecayRpcScheduler
org.apache.hadoop.ipc.TestIPC

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4499//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4499//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101839#comment-14101839 ]

Sangjin Lee commented on HADOOP-10893:

I don't believe the test failures are related to the patch.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099276#comment-14099276 ]

Hadoop QA commented on HADOOP-10893:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12662114/HADOOP-10893.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099429#comment-14099429 ]

Hadoop QA commented on HADOOP-10893:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12662190/HADOOP-10893.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl

The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4483//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4483//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099432#comment-14099432 ]

Hadoop QA commented on HADOOP-10893:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12662190/HADOOP-10893.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4484//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4484//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096280#comment-14096280 ]

Jason Lowe commented on HADOOP-10893:

Thanks, Sangjin! Some comments on the patch:

HADOOP_USE_CLIENT_CLASSLOADER would be a better name to keep within the HADOOP_* shell variable namespace and is consistent with HADOOP_USER_CLASSPATH_FIRST. Similarly HADOOP_CLIENT_SYSTEM_CLASSES.

ApplicationClassLoader was marked Public, so I'm wondering if we should leave a deprecated, trivial derivation of the new class at the old location just in case someone referenced it?

What was the rationale behind the Splitter change, which seems unrelated?

Would be nice if we could somehow tie the default system classes defined in RunJar with the default for the job classloader so we don't have to remember to change it in two places going forward. Unfortunately the job classloader one is encoded in mapred-default.xml, so I don't know of a good way to do this offhand. Any ideas?

The doc comments in hadoop-config.sh should mention the client system classes variable, how to use it, and potentially even its default value. I know, I know. Yet another place to update if it changes, but users will likely have easy access to the config comment and not the java/javadoc for RunJar. Or maybe the default should already be in hadoop-config.sh with a hardcoded, last-resort fallback in RunJar if not set in hadoop-config.sh? Anyway we should at least mention the ability to specify the system classes.

Would be nice if we could have a unit test to verify the functionality is working going forward. Maybe a unit test that writes out some app code in a jar, has RunJar run it with the client classloader, and the app code verifies it has appropriate classpath semantics? Thinking something along the lines of how TestApplicationClassloader works but verifying RunJar set up the classloaders properly.

Nit: Not thrilled to see that the variable just has to be defined to anything, although I see HADOOP_USER_CLASSPATH_FIRST set a precedent for it. Leads to unexpected behavior if a user sees something like HADOOP_USER_CLASSPATH_FIRST=true and tries HADOOP_USER_CLASSPATH_FIRST=false. Not a must-fix, but it'd be nice to only accept expected values for the variable. A shell func to sanity-check a boolean env would be helpful, maybe something to tackle in a followup JIRA.
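The nit about accepting only expected values for a boolean variable can be illustrated on the Java side as well. The sketch below is an assumption-laden analogue of the suggested shell function, not something from the patch or from Hadoop: the class StrictBooleanEnv and its parseStrict method are hypothetical names, and treating an unset or empty value as "use the default" is a choice made here for the example.

```java
// Hypothetical helper, for illustration only; Hadoop does not ship this class.
final class StrictBooleanEnv {

  // Accepts only "true" or "false" (case-insensitive). An unset or empty
  // value falls back to defaultValue; anything else is rejected loudly
  // instead of being silently treated as "defined, therefore enabled".
  static boolean parseStrict(String name, String value, boolean defaultValue) {
    if (value == null || value.isEmpty()) {
      return defaultValue;
    }
    if (value.equalsIgnoreCase("true")) {
      return true;
    }
    if (value.equalsIgnoreCase("false")) {
      return false;
    }
    throw new IllegalArgumentException(
        name + " must be 'true' or 'false', but was: " + value);
  }

  public static void main(String[] args) {
    String name = "HADOOP_USE_CLIENT_CLASSLOADER";
    System.out.println(name + " = " + parseStrict(name, System.getenv(name), false));
  }
}
```

With this kind of check, setting the variable to false disables the feature and a typo such as "ture" fails fast, rather than either value enabling the feature merely because the variable is defined.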
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096331#comment-14096331 ]

Sangjin Lee commented on HADOOP-10893:
--------------------------------------

Thanks for the review, Jason! It's helpful as always.

bq. HADOOP_USE_CLIENT_CLASSLOADER would be a better name to keep within the HADOOP_* shell variable namespace and is consistent with HADOOP_USER_CLASSPATH_FIRST. Similarly HADOOP_CLIENT_SYSTEM_CLASSES.

Sounds good. I'll make the change.

bq. ApplicationClassLoader was marked Public, so I'm wondering if we should leave a deprecated, trivial derivation of the new class location just in case someone referenced it?

I did not notice that it was marked public. I'll recreate it as a deprecated subclass in its current location.

bq. What was the rationale behind the Splitter change which seems unrelated?

If possible, I wanted to avoid having a dependency from this classloader class on another library unless it's really necessary. Splitter was coming from guava. :) In theory it should be OK even if ApplicationClassLoader used a guava class. It would be loaded by the system classloader anyway, and it would not interfere with the ApplicationClassLoader's ability to load a new version of the class for the user. However, it was more of a judgment call to minimize ApplicationClassLoader's external dependencies. I believe the current version (using String.split()) is equivalent and the Splitter is not needed, but I'd be open to reversing it.

{quote}
Would be nice if we could somehow tie the default system classes defined in RunJar with the default for the job classloader so we don't have to remember to change it in two places going forward. Unfortunately the job classloader one is encoded in mapred-default.xml, so I don't know of a good way to do this offhand. Any ideas?
{quote}

I struggled with that decision a bit. As you mentioned, if you want to override the defaults, you'd need to do it in two places if you use it for the client and for the tasks as well (and I would imagine that is true in the vast majority of cases). I feel it would be better if at least the default were in one place. In that sense, how about having the default in ApplicationClassLoader itself? You still need to override it in two places, but it feels like an improvement over the current version.

{quote}
The doc comments in hadoop-config.sh should mention the client system classes variable, how to use it, and potentially even its default value. I know, I know. Yet another place to update if it changes, but users will likely have easy access to the config comment and not the java/javadoc for RunJar. Or maybe the default should already be in hadoop-config.sh with a hardcoded, last-resort fallback in RunJar if not set in hadoop-config.sh? Anyway we should at least mention the ability to specify the system classes.
{quote}

I agree it would be good to document the usage of the system classes env variable. I'll add the comment to hadoop-config.sh. See above for where to define the default and let me know what you think.

{quote}
Would be nice if we could have a unit test to verify the functionality is working going forward. Maybe a unit test that writes out some app code in a jar, has RunJar run it with the client classloader, and the app code verifies it has appropriate classpath semantics? Thinking something along the lines of how TestApplicationClassLoader works but verifying RunJar set up the classloaders properly.
{quote}

Let me look into a unit test for this that involves RunJar. Off the top of your head, do you happen to know of an existing test that writes out classes/jars?

{quote}
Nit: Not thrilled to see that the variable just has to be defined to anything, although I see HADOOP_USER_CLASSPATH_FIRST set a precedent for it. Leads to unexpected behavior if a user sees something like HADOOP_USER_CLASSPATH_FIRST=true and tries HADOOP_USER_CLASSPATH_FIRST=false. Not a must-fix, but it'd be nice to only accept expected values for the variable. A shell func to sanity-check a boolean env would be helpful, maybe something to tackle in a followup JIRA.
{quote}

Yes, I stumbled on that as well, and it struck me as odd behavior. I think I'll file a separate JIRA to tackle that issue...
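On the Splitter and default-system-classes points in the reply above: the following is a minimal sketch, not the code from the patch, of how a comma-separated system classes value could be parsed with plain String.split() and how a single shared default could serve as the fallback. The class name SystemClasses, the DEFAULT value, and the matching rule (an entry ending in "." is a package prefix, anything else is an exact class name) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for a single, shared default system classes list. */
public class SystemClasses {

  /** Illustrative value only; NOT the default used by the patch. */
  public static final String DEFAULT = "java.,javax.,org.apache.hadoop.";

  /** Splits on commas, trims each entry, and drops empty entries. */
  public static List<String> parse(String value) {
    List<String> entries = new ArrayList<>();
    if (value == null) {
      return entries;
    }
    for (String raw : value.split(",")) {
      String entry = raw.trim();
      if (!entry.isEmpty()) {
        entries.add(entry);
      }
    }
    return entries;
  }

  /** Uses the override when it yields any entries, else the shared default. */
  public static List<String> resolve(String override) {
    List<String> parsed = parse(override);
    return parsed.isEmpty() ? parse(DEFAULT) : parsed;
  }

  /** Assumed rule: trailing "." means package prefix, otherwise exact match. */
  public static boolean matches(String className, List<String> entries) {
    for (String entry : entries) {
      boolean hit = entry.endsWith(".")
          ? className.startsWith(entry)
          : className.equals(entry);
      if (hit) {
        return true;
      }
    }
    return false;
  }
}
```

The trim-and-skip-empty loop is roughly what Guava's Splitter.on(',').trimResults().omitEmptyStrings() would produce, which is why dropping the Guava dependency need not change behavior. Both the client path and the task path could call resolve() with their own override, so the default would live in one place even though the overrides remain separate.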
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078974#comment-14078974 ]

Sangjin Lee commented on HADOOP-10893:
--------------------------------------

Any feedback on this is greatly appreciated. Thanks!
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076687#comment-14076687 ]

Hadoop QA commented on HADOOP-10893:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12658222/classloader-test.tar.gz
  against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4372//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076761#comment-14076761 ]

Sangjin Lee commented on HADOOP-10893:
--------------------------------------

I have posted the patch for using the isolated classloader on the client side. I've tested it with a simple test driver (I'll post it once jenkins goes through the current patch) to verify that the user code and its dependencies are loaded through the application classloader, and that hadoop can load versions of the same dependencies that differ from the user's.

Some key points about the patch:
- I have moved org.apache.hadoop.yarn.util.ApplicationClassLoader to hadoop-common so it can be used by the client side: ApplicationClassLoader is good enough for the client too
- the feature is enabled by setting an environment variable: this is in keeping with the USER_CLASSPATH_FIRST behavior
- it also has the system classes, which can be overridden via an environment variable

It turned out to be a bit simpler than I initially expected. The situation is pretty similar to (but not entirely the same as) the YarnChild case.

I've also tested a real job submission of a fairly small app with several dependencies. I'd love to hear feedback on the patch. Thanks!
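The loading behavior described in the patch summary above (user classes first, system classes always from the parent) can be sketched roughly as follows. This is a minimal illustration of the general child-first technique, not Hadoop's ApplicationClassLoader; the class name IsolatingClassLoader and the prefix-only matching of system classes are assumptions.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

/** Hypothetical child-first loader; NOT Hadoop's ApplicationClassLoader. */
public class IsolatingClassLoader extends URLClassLoader {

  private final List<String> systemPrefixes;

  public IsolatingClassLoader(URL[] userUrls, ClassLoader parent,
      List<String> systemPrefixes) {
    super(userUrls, parent);
    this.systemPrefixes = systemPrefixes;
  }

  /** True if the class must always be loaded by the parent loader. */
  public boolean isSystemClass(String name) {
    for (String prefix : systemPrefixes) {
      if (name.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve)
      throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      Class<?> c = findLoadedClass(name);
      if (c == null && !isSystemClass(name)) {
        try {
          // Look in the user's jars first so their versions win.
          c = findClass(name);
        } catch (ClassNotFoundException e) {
          // Not in the user's jars; fall back to the parent below.
        }
      }
      if (c == null) {
        // System classes, and anything the user jars lack, use the
        // standard parent-first delegation.
        c = super.loadClass(name, false);
      }
      if (resolve) {
        resolveClass(c);
      }
      return c;
    }
  }
}
```

The system classes list is what keeps the isolation from going too far: classes such as java.* and the framework's own API must come from the parent so that user code and the framework agree on the same Class objects.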
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076777#comment-14076777 ]

Sangjin Lee commented on HADOOP-10893:
--------------------------------------

Incidentally, I find USER_CLASSPATH_FIRST to be somewhat broken. It applies to what's being provided through the CLASSPATH environment variable. However, it does not apply to what's being provided through the jar itself (it always comes last in the class search).
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077001#comment-14077001 ]

Hadoop QA commented on HADOOP-10893:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12658221/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4374//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/4374//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4374//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077252#comment-14077252 ]

Hadoop QA commented on HADOOP-10893:
------------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12658293/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4376//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4376//console

This message is automatically generated.
[jira] [Commented] (HADOOP-10893) isolated classloader on the client side
[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077355#comment-14077355 ]

Hadoop QA commented on HADOOP-10893:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12658339/classloader-test.tar.gz
  against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4378//console

This message is automatically generated.