[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-22 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106716#comment-14106716 ]

Hudson commented on HADOOP-10893:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #654 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/654/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.cmd
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop.cmd
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ApplicationClassLoader.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheck.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheckMain.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheckSecond.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/ClassLoaderCheckThird.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestApplicationClassLoader.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestRunJar.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ApplicationClassLoader.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApplicationClassLoader.java


 isolated classloader on the client side
 ---

 Key: HADOOP-10893
 URL: https://issues.apache.org/jira/browse/HADOOP-10893
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Fix For: 3.0.0, 2.6.0

 Attachments: HADOOP-10893-branch-2.patch, 
 HADOOP-10893-branch-2.patch, HADOOP-10893.patch, HADOOP-10893.patch, 
 HADOOP-10893.patch, HADOOP-10893.patch, HADOOP-10893.patch, 
 HADOOP-10893.patch, HADOOP-10893.patch, classloader-test.tar.gz


 We have the job classloader on the MapReduce tasks that run on the cluster. 
 It has the benefit of isolating the class space for user code and avoiding 
 version clashes.
 Although they occur less often, version clashes also occur on the client JVM. 
 It would be good to introduce an isolated classloader on the client side as 
 well to address this. A natural point to introduce this may be RunJar, as 
 that's how most Hadoop jobs are run.
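
 To illustrate the idea described above, here is a minimal sketch of a
 classloader that isolates user code. It is not the ApplicationClassLoader
 committed for this issue. The class name ChildFirstClassLoader, the
 isSystemClass helper, and the prefix list are hypothetical and chosen only for
 illustration. The loader looks in the user's own jars first and delegates to
 the parent only for designated "system" classes, or when the class is not
 found locally.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a child-first (isolating) classloader.
// Not Hadoop's actual implementation.
class ChildFirstClassLoader extends URLClassLoader {
    // Class-name prefixes that must always come from the parent loader.
    private final List<String> systemPrefixes;

    ChildFirstClassLoader(URL[] urls, ClassLoader parent, List<String> systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = new ArrayList<>(systemPrefixes);
    }

    // Returns true if the class name matches one of the system prefixes.
    public boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Look in the user-supplied URLs before asking the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the user jars; fall through to the parent below.
                }
            }
            if (c == null) {
                // System class, or not found locally: standard parent-first lookup.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

 With such a loader, a client-side launcher could construct it over the job
 jar's URLs and load the main class through it, so that a library version
 bundled with the job takes precedence over the one on the framework
 classpath. Which prefixes count as system classes is a design choice; the
 JDK packages at least need to stay with the parent.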



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-22 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106859#comment-14106859 ]

Hudson commented on HADOOP-10893:
-

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1845 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1845/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)


[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-22 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106931#comment-14106931 ]

Hudson commented on HADOOP-10893:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1871 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1871/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)


[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-21 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105697#comment-14105697 ]

Jason Lowe commented on HADOOP-10893:
-

Thanks for updating the branch-2 patch, Sangjin.

I noticed that hadoop.cmd in that patch doesn't have the comments that were 
added for the new environment variables in the trunk patch.  Could you update 
the branch-2 patch to add them as well, for consistency?  Similarly, the trunk 
patch added comments and example exports to hadoop-env.sh that are missing 
from the branch-2 patch.  I'm not sure whether the comments in 
hadoop-config.sh should be moved or copied to hadoop-env.sh with sample 
exports added, or whether we want them left in hadoop-config.sh for branch-2.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-21 Thread Sangjin Lee (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105724#comment-14105724 ]

Sangjin Lee commented on HADOOP-10893:
--

The comments in hadoop-env.sh in the trunk patch are found in hadoop-config.sh 
in the branch-2 patch. It appears that the existing pre-HADOOP-9902 comments in 
hadoop-config.sh were replaced by (shorter) comments in hadoop-env.sh. 
hadoop-env.sh did not have comments on things like HADOOP_USER_CLASSPATH_FIRST 
prior to HADOOP-9902. Following that move, I basically moved the new comments 
from hadoop-config.sh to hadoop-env.sh. If we were to move those comments to 
hadoop-env.sh in branch-2, then I think we'd need to move the other existing 
comments similarly, or the new comments would look out of place. 
Thoughts?

Also, prior to HADOOP-9902, hadoop-config.cmd did not have the mirroring 
comments for the environment variables. That felt somewhat awkward, but again 
I felt that we should either recreate the same comments for all environment 
variables or for none. Now that this has been corrected with HADOOP-9902, I 
decided to add the same comments to hadoop.cmd in trunk, hence the discrepancy 
between branch-2 and trunk.

Let me know what you think about these two... Thanks!



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-21 Thread Allen Wittenauer (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105819#comment-14105819 ]

Allen Wittenauer commented on HADOOP-10893:
---

tl;dr: [~sjlee0]'s changes here are probably the correct ones. 

From a pure patch perspective, it does look weird.  But from a stylistic 
perspective as a part of a total work (namely, hadoop 2.x), the changes and 
lack of documentation in hadoop-env.sh, etc, to branch-2 make a lot of sense. 
One of the key points of HADOOP-9902 was to highlight to end users what things 
they could set.  Hiding that in hadoop-config.sh, which users are never 
directed to documentation-wise, didn't really work. So I pulled those out and 
popped them into hadoop-env.sh, which users definitely see.

This change just got caught in the crossfire.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-21 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105830#comment-14105830 ]

Jason Lowe commented on HADOOP-10893:
-

Sounds good, thanks for the clarification Sangjin and Allen!

+1 for the latest patches, will commit this in a bit.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-21 Thread Sangjin Lee (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105858#comment-14105858 ]

Sangjin Lee commented on HADOOP-10893:
--

Thanks both!



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-21 Thread Sangjin Lee (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106046#comment-14106046 ]

Sangjin Lee commented on HADOOP-10893:
--

Thanks!



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-21 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106195#comment-14106195 ]

Hudson commented on HADOOP-10893:
-

FAILURE: Integrated in Hadoop-trunk-Commit #6093 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6093/])
HADOOP-10893. isolated classloader on the client side. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619604)


[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-20 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103532#comment-14103532 ]

Hadoop QA commented on HADOOP-10893:


-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12662951/HADOOP-10893.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified test files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  There were no new javadoc warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.ha.TestActiveStandbyElector

+1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4511//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4511//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-20 Thread Allen Wittenauer (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103886#comment-14103886 ]

Allen Wittenauer commented on HADOOP-10893:
---

Yeah, the export line examples make a huge difference.  Thanks.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-20 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103995#comment-14103995 ]

Jason Lowe commented on HADOOP-10893:
-

Latest trunk patch looks good.  Sangjin, could you update the branch-2 patch 
accordingly?  After that I think this is ready to commit.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104414#comment-14104414
 ] 

Hadoop QA commented on HADOOP-10893:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12663145/HADOOP-10893-branch-2.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4519//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-19 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102227#comment-14102227
 ] 

Kihwal Lee commented on HADOOP-10893:
-

+1 the patch looks good. [~jlowe], do you have any further comments?



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102842#comment-14102842
 ] 

Jason Lowe commented on HADOOP-10893:
-

The patch no longer applies after HADOOP-9902, and since that only went into 
trunk we'll need a separate patch for branch-2.

I tried to kick the tires on the latest patch but the client classloader never 
activated even though I set HADOOP_USE_CLIENT_CLASSLOADER=true.  That's because 
the following code will always return false:
{code}
  boolean useClientClassLoader() {
    return Boolean.getBoolean(System.getenv(HADOOP_USE_CLIENT_CLASSLOADER));
  }
{code}
Boolean.getBoolean looks up the system property whose name is the given 
string, whereas Boolean.parseBoolean parses the given string itself as a 
boolean.
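
The difference can be reproduced in isolation. The sketch below is not the patch's code: the class name ClientClassLoaderFlag and the two helper methods are hypothetical, and the environment value is passed in as a parameter so that both forms can be compared without setting a real environment variable.

```java
final class ClientClassLoaderFlag {
  // Name of the environment variable discussed in this thread.
  static final String HADOOP_USE_CLIENT_CLASSLOADER =
      "HADOOP_USE_CLIENT_CLASSLOADER";

  // Buggy form: Boolean.getBoolean treats its argument as the NAME of a
  // system property. Passing the value "true" looks up a property called
  // "true", which is normally unset, so the result is false.
  static boolean buggy(String envValue) {
    return Boolean.getBoolean(envValue);
  }

  // Corrected form: Boolean.parseBoolean parses the string itself and
  // returns false for null.
  static boolean fixed(String envValue) {
    return Boolean.parseBoolean(envValue);
  }

  public static void main(String[] args) {
    String value = System.getenv(HADOOP_USE_CLIENT_CLASSLOADER);
    System.out.println("getBoolean:   " + buggy(value));
    System.out.println("parseBoolean: " + fixed(value));
  }
}
```

Running main with HADOOP_USE_CLIENT_CLASSLOADER=true exported should show the two calls disagreeing, unless a system property literally named "true" happens to be defined.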

Other comments are minor or nits:

In hadoop-config.sh, "The system classes are A comma-separated list" s/b "The 
system classes are a comma-separated list".

It would be nice if TestMain, TestSecond, TestThird were a bit less generically 
named since they are for a very specific test, e.g.: ClassLoaderCheckAppMain, 
ClassLoaderCheckAppSecond, ClassLoaderCheckAppThird, etc.  Not a must-fix, but 
people may wonder what the names mean when they run across them in the source, 
since TestMain sounds pretty generic.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102898#comment-14102898
 ] 

Sangjin Lee commented on HADOOP-10893:
--

Thanks Jason. I'll address those issues, and upload a new patch (and add a 
separate patch for branch-2). It was an oversight on my part.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-19 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103194#comment-14103194
 ] 

Allen Wittenauer commented on HADOOP-10893:
---

{code}
+# If HADOOP_USE_CLIENT_CLASSLOADER is set, user classes and their dependencies
+# as defined by HADOOP_CLASSPATH and the jar as the hadoop jar argument are
+# loaded by a separate classloader. It should not be mixed with
+# HADOOP_USER_CLASSPATH_FIRST. If it is set, HADOOP_USER_CLASSPATH_FIRST is
+# ignored. Can be defined by doing
+# export HADOOP_USE_CLIENT_CLASSLOADER=true
+
+# HADOOP_CLIENT_CLASSLOADER_SYSTEM_CLASSES overrides the default definition of
+# system classes for the client classloader. The system classes are a
+# comma-separated list of classes that should be loaded from the system
+# classpath, not the user-supplied JARs, when HADOOP_USE_CLIENT_CLASSLOADER is
+# enabled. Names ending in '.' (period) are treated as package names, and names
+# starting with a '-' are treated as negative matches.
+
{code}

I'm not a fan of this wall of text sitting in hadoop-env.sh.  Ideally, this 
should really be in documentation with a very light description here; that 
second paragraph seems too much.  Additionally, burying the variable in the 
middle of the description is confusing.  It should be the last thing in the 
section so that it is clear that's what one needs to change. In other words, 
follow the pattern established elsewhere.

The change to hadoop_add_to_classpath_userpath looks fine, based upon my 
understanding of what this patch is doing.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-19 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103196#comment-14103196
 ] 

Allen Wittenauer commented on HADOOP-10893:
---

OK, I see the mistake I made.  There is no example export line for 
HADOOP_CLIENT_CLASSLOADER_SYSTEM_CLASSES so I thought it was still describing 
the first one. So yeah, add that instead. ;)



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103212#comment-14103212
 ] 

Sangjin Lee commented on HADOOP-10893:
--

Thanks for the review Allen. I didn't find a suitable home for the description, 
and was following the convention prior to the change. Now that the pattern has 
changed, let me see if I can be more concise here. I'll also add an example 
export line for the latter variable.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-18 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100729#comment-14100729
 ] 

Kihwal Lee commented on HADOOP-10893:
-

The latest patch looks good. One thing I am not sure about is leaving the conf 
in mapred-default.xml.  Since the default system classes are in the code, the 
entry in mapred-site.xml kind of serves as documentation, which needs to be 
kept in sync with the code. From a pure functionality point of view, we can 
simply remove it, but then those who want to modify the list need to look at 
the code?  Maybe we can have the MR logs show the list of system classes when 
the app class loader is activated. Then it might be easier for users to figure 
out the default/current list and modify it. Any other thoughts?
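
The logging idea could look roughly like the following. This is only a sketch of the suggestion, not code from the patch: LoggingAppClassLoader is a hypothetical stand-in for the application classloader, it uses java.util.logging rather than whatever logging facility Hadoop uses, and the message format is an assumption.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for the application classloader; it only shows
// where a one-time log line with the effective system classes could go.
final class LoggingAppClassLoader extends URLClassLoader {
  private static final Logger LOG =
      Logger.getLogger(LoggingAppClassLoader.class.getName());

  LoggingAppClassLoader(URL[] urls, ClassLoader parent,
      List<String> systemClasses) {
    super(urls, parent);
    // Emitted once per classloader instance, so users can copy the
    // current list from the log and adjust it.
    LOG.info(describe(systemClasses));
  }

  // Builds the log message; kept separate so it can be checked directly.
  static String describe(List<String> systemClasses) {
    return "system classes: " + String.join(",", systemClasses);
  }
}
```

Printing the list in the same comma-separated form that the override accepts would let users paste it back into their configuration and edit it from there.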



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100931#comment-14100931
 ] 

Sangjin Lee commented on HADOOP-10893:
--

Thanks for the comment, Kihwal.

I agree that if the default is now in the source directly, re-defining the 
same default in mapred-default.xml is less than optimal. I like the idea of 
printing out the system classes when the application classloader is 
instantiated.

Having said that, how about the definition of the environment variable on the 
client classloader usage side (which was added in the latest patch)? To be 
symmetric, I think it should be removed again as well.

Thoughts?



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100957#comment-14100957
 ] 

Sangjin Lee commented on HADOOP-10893:
--

On the other hand, mapred-default.xml had a good description of the format of 
the system classes value:

{panel}
A comma-separated list of classes that should be loaded from the system 
classpath, not the user-supplied JARs, when mapreduce.job.classloader is 
enabled. Names ending in '.' (period) are treated as package names, and names 
starting with a '-' are treated as negative matches.
{panel}

We could move that to the javadoc of ApplicationClassLoader, but that's a 
little less than satisfying, as users (not developers) are the ones who need to 
override this value.
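
To make the format in that description concrete, here is a minimal sketch of how such a list could be evaluated. It is not the ApplicationClassLoader implementation: SystemClassMatcher and its methods are hypothetical names, and the rule that any matching negative entry overrides positive matches is an assumption about how "negative matches" are meant to behave.

```java
import java.util.Arrays;
import java.util.List;

final class SystemClassMatcher {
  // Splits the comma-separated configuration value into entries.
  static List<String> parse(String commaSeparated) {
    return Arrays.asList(commaSeparated.split(","));
  }

  // Returns true when className should come from the system classpath.
  static boolean isSystemClass(String className, List<String> entries) {
    boolean matched = false;
    for (String raw : entries) {
      String entry = raw.trim();
      if (entry.isEmpty()) {
        continue;
      }
      // A leading '-' marks a negative match.
      boolean negative = entry.startsWith("-");
      if (negative) {
        entry = entry.substring(1);
      }
      // A trailing '.' marks a package name; otherwise match the class.
      boolean hit = entry.endsWith(".")
          ? className.startsWith(entry)
          : className.equals(entry);
      if (hit) {
        if (negative) {
          return false; // assumption: a negative match always wins
        }
        matched = true;
      }
    }
    return matched;
  }

  public static void main(String[] args) {
    List<String> entries =
        parse("java.,-org.apache.hadoop.UserClass,org.apache.hadoop.");
    System.out.println(isSystemClass("org.apache.hadoop.fs.Path", entries));
    System.out.println(isSystemClass("org.apache.hadoop.UserClass", entries));
  }
}
```

With the example list in main, classes under org.apache.hadoop. are treated as system classes, except org.apache.hadoop.UserClass, which the negative entry hands back to the user-supplied JARs.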



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100979#comment-14100979
 ] 

Sangjin Lee commented on HADOOP-10893:
--

I suppose we can remove the (redundant) value but keep the description. I'll 
post an updated patch shortly.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101796#comment-14101796
 ] 

Hadoop QA commented on HADOOP-10893:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662565/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.ipc.TestDecayRpcScheduler
  org.apache.hadoop.ipc.TestIPC

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4499//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4499//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101839#comment-14101839
 ] 

Sangjin Lee commented on HADOOP-10893:
--

I don't believe the test failures are related to the patch.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099276#comment-14099276
 ] 

Hadoop QA commented on HADOOP-10893:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662114/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4482//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099429#comment-14099429
 ] 

Hadoop QA commented on HADOOP-10893:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662190/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl

  The test build failed in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4483//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4483//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099432#comment-14099432
 ] 

Hadoop QA commented on HADOOP-10893:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662190/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4484//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4484//console

This message is automatically generated.

 isolated classloader on the client side
 ---

 Key: HADOOP-10893
 URL: https://issues.apache.org/jira/browse/HADOOP-10893
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: HADOOP-10893.patch, HADOOP-10893.patch, 
 HADOOP-10893.patch, HADOOP-10893.patch, classloader-test.tar.gz


 We have the job classloader on the mapreduce tasks that run on the cluster. 
 It has the benefit of isolating the class space for user code and avoiding 
 version clashes.
 Although they occur less often, version clashes do occur on the client JVM. It 
 would be good to introduce an isolated classloader on the client side as well 
 to address this. A natural point to introduce this may be through RunJar, as 
 that's how most Hadoop jobs are run.





[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-13 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096280#comment-14096280
 ] 

Jason Lowe commented on HADOOP-10893:
-

Thanks, Sangjin!  Some comments on the patch:

HADOOP_USE_CLIENT_CLASSLOADER would be a better name to keep within the 
HADOOP_* shell variable namespace and is consistent with 
HADOOP_USER_CLASSPATH_FIRST.  Similarly HADOOP_CLIENT_SYSTEM_CLASSES.

ApplicationClassLoader was marked Public, so I'm wondering if we should leave a 
deprecated, trivial derivation of the new class location just in case someone 
referenced it?

What was the rationale behind the Splitter change which seems unrelated?

Would be nice if we could somehow tie the default system classes defined in 
RunJar with the default for the job classloader so we don't have to remember to 
change it in two places going forward.  Unfortunately the job classloader one 
is encoded in mapred-default.xml, so I don't know of a good way to do this 
offhand.  Any ideas?

The doc comments in hadoop-config.sh should mention the client system classes 
variable, how to use it, and potentially even its default value.  I know, I 
know.  Yet another place to update if it changes, but users will likely have 
easy access to the config comment and not the java/javadoc for RunJar.  Or 
maybe the default should already be in hadoop-config.sh with a hardcoded, 
last-resort fallback in RunJar if not set in hadoop-config.sh?  Anyway we 
should at least mention the ability to specify the system classes.

Would be nice if we could have a unit test to verify the functionality is 
working going forward.  Maybe a unit test that writes out some app code in a 
jar, has RunJar run it with the client classloader, and the app code verifies 
it has appropriate classpath semantics?  Thinking something along the lines of 
how TestApplicationClassLoader works but verifying that RunJar set up the 
classloaders properly.

Nit: Not thrilled to see that the variable just has to be defined to anything, 
although I see HADOOP_USER_CLASSPATH_FIRST set a precedent for it.  Leads to 
unexpected behavior if a user sees something like 
HADOOP_USER_CLASSPATH_FIRST=true and tries HADOOP_USER_CLASSPATH_FIRST=false.  
Not a must-fix, but it'd be nice to only accept expected values for the 
variable.  A shell func to sanity-check a boolean env would be helpful, maybe 
something to tackle in a followup JIRA.




[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-08-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096331#comment-14096331
 ] 

Sangjin Lee commented on HADOOP-10893:
--

Thanks for the review, Jason! It's helpful as always.

bq. HADOOP_USE_CLIENT_CLASSLOADER would be a better name to keep within the 
HADOOP_* shell variable namespace and is consistent with 
HADOOP_USER_CLASSPATH_FIRST. Similarly HADOOP_CLIENT_SYSTEM_CLASSES.

Sounds good. I'll make the change.

bq. ApplicationClassLoader was marked Public, so I'm wondering if we should 
leave a deprecated, trivial derivation of the new class location just in case 
someone referenced it?

I did not notice that it was marked public. I'll recreate a deprecated 
extending class in its current location.

bq. What was the rationale behind the Splitter change which seems unrelated?

If possible, I wanted to avoid having a dependency from this classloader class 
to another library unless it's really necessary. Splitter was coming from 
guava. :) In theory it should be OK even if ApplicationClassLoader used a guava 
class. It would be loaded by the system classloader anyway, and it would not 
interfere with the ApplicationClassLoader's ability to load a new version of 
the class for the user.

However, it was more of a call to minimize the external dependency from 
ApplicationClassLoader. I believe the current version (using String.split()) is 
equivalent and using the Splitter is not needed, but I'd be open to reversing 
it.
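
As a rough illustration of the String.split() approach, here is a minimal sketch of parsing a comma-separated system classes list without guava. The class name SystemClassesParser and the sample list are hypothetical and are not taken from the patch; trimming and dropping empty entries is an assumption about the desired behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not the code from the patch. */
final class SystemClassesParser {
    private SystemClassesParser() {}

    /** Splits a comma-separated list, trimming entries and dropping empty ones. */
    static List<String> parse(String csv) {
        List<String> result = new ArrayList<>();
        if (csv == null) {
            return result;
        }
        // String.split() keeps interior empty strings, so filter them explicitly.
        for (String entry : csv.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The sample value below is made up for the demonstration.
        System.out.println(parse("java., javax. ,,org.apache.hadoop."));
    }
}
```

The explicit trim and empty check stand in for what Splitter would do with trimResults() and omitEmptyStrings().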

{quote}
Would be nice if we could somehow tie the default system classes defined in 
RunJar with the default for the job classloader so we don't have to remember to 
change it in two places going forward. Unfortunately the job classloader one is 
encoded in mapred-default.xml, so I don't know of a good way to do this 
offhand. Any ideas?
{quote}

I struggled with that decision a bit. As you mentioned, if you want to override 
the defaults, you'd need to do it in two places if you use it for the client 
and for the tasks as well (which I would imagine is true in the vast majority 
of cases).

I feel it would be better if at least the default were in one place. In that 
sense, how about having the default in ApplicationClassLoader itself? You 
still need to override it in two places, but it feels like an improvement over 
the current version.

{quote}
The doc comments in hadoop-config.sh should mention the client system classes 
variable, how to use it, and potentially even its default value. I know, I 
know. Yet another place to update if it changes, but users will likely have 
easy access to the config comment and not the java/javadoc for RunJar. Or maybe 
the default should already be in hadoop-config.sh with a hardcoded, last-resort 
fallback in RunJar if not set in hadoop-config.sh? Anyway we should at least 
mention the ability to specify the system classes.
{quote}

I agree it would be good to document the usage of the system classes env 
variable. I'll add the comment to hadoop-config.sh. See above for where to 
define the default and let me know what you think.

{quote}
Would be nice if we could have a unit test to verify the functionality is 
working going forward. Maybe a unit test that writes out some app code in a 
jar, has RunJar run it with the client classloader, and the app code verifies 
it has appropriate classpath semantics? Thinking something along the lines of 
how TestApplicationClassLoader works but verifying that RunJar set up the 
classloaders properly.
{quote}

Let me look into a unit test for this that involves RunJar. Do you happen to 
know of an existing test that writes out classes/jars off the top of your head?
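
For reference, writing a jar on the fly and reading it back through a classloader can be sketched with the JDK alone. This is only an assumption about how such a test could be built; TestJarWriter is a hypothetical name, and it stores a plain text resource where a real test would store compiled class bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

/** Hypothetical test helper for illustration only. */
final class TestJarWriter {
    private TestJarWriter() {}

    /** Writes a jar with a single entry into dir (null means the default temp dir). */
    static File writeJar(File dir, String entryName, byte[] content) throws IOException {
        File jar = File.createTempFile("test", ".jar", dir);
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entryName));
            out.write(content);
            out.closeEntry();
        }
        return jar;
    }

    /** Reads an entry back through a URLClassLoader; returns null if it is missing. */
    static String readEntry(File jar, String entryName) throws IOException {
        URL[] urls = {jar.toURI().toURL()};
        try (URLClassLoader loader = new URLClassLoader(urls, null);
             InputStream in = loader.getResourceAsStream(entryName)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        File jar = writeJar(null, "marker.txt", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readEntry(jar, "marker.txt"));
        jar.delete();
    }
}
```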

{quote}
Nit: Not thrilled to see that the variable just has to be defined to anything, 
although I see HADOOP_USER_CLASSPATH_FIRST set a precedent for it. Leads to 
unexpected behavior if a user sees something like 
HADOOP_USER_CLASSPATH_FIRST=true and tries HADOOP_USER_CLASSPATH_FIRST=false. 
Not a must-fix, but it'd be nice to only accept expected values for the 
variable. A shell func to sanity-check a boolean env would be helpful, maybe 
something to tackle in a followup JIRA.
{quote}

Yes, I stumbled on that as well, and it struck me as odd behavior. I think 
I'll file a separate JIRA to tackle that issue...
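
The review suggested a shell function for this. To illustrate the same idea of accepting only expected values, here is a minimal sketch in Java rather than shell. StrictBooleanEnv is a hypothetical helper, and treating an unset or empty variable as the default is an assumption, not behavior taken from the patch.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the patch. */
final class StrictBooleanEnv {
    private StrictBooleanEnv() {}

    /**
     * Returns defaultValue when value is unset or blank, the parsed boolean for
     * "true"/"false" (case-insensitive), and rejects anything else.
     */
    static boolean parse(String name, String value, boolean defaultValue) {
        if (value == null || value.trim().isEmpty()) {
            return defaultValue;
        }
        String normalized = value.trim().toLowerCase(Locale.ROOT);
        if (normalized.equals("true")) {
            return true;
        }
        if (normalized.equals("false")) {
            return false;
        }
        throw new IllegalArgumentException(
            name + " must be 'true' or 'false', got: " + value);
    }

    public static void main(String[] args) {
        String name = "HADOOP_USE_CLIENT_CLASSLOADER";
        System.out.println(parse(name, System.getenv(name), false));
    }
}
```

With a check like this, setting the variable to false disables the feature instead of silently enabling it.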


[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-07-30 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078974#comment-14078974
 ] 

Sangjin Lee commented on HADOOP-10893:
--

Any feedback on this is greatly appreciated. Thanks!



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076687#comment-14076687
 ] 

Hadoop QA commented on HADOOP-10893:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12658222/classloader-test.tar.gz
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4372//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-07-28 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076761#comment-14076761
 ] 

Sangjin Lee commented on HADOOP-10893:
--

I have posted the patch for using the isolated classloader on the client side. 
I've tested it with a simple test driver (I'll post it once Jenkins goes 
through the current patch) to verify that the user code and its dependencies 
are loaded through the application classloader, and that Hadoop can load 
versions of the same dependencies that differ from the user's.

Some key points about the patch:
- I have moved org.apache.hadoop.yarn.util.ApplicationClassLoader to 
hadoop-common so it can be used by the client-side: ApplicationClassLoader is 
good enough for the client too
- the feature is enabled by setting an environment variable: this is in keeping 
with the USER_CLASSPATH_FIRST behavior
- it also has the system classes, which can be overridden via an environment 
variable
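
To make the idea concrete, here is a minimal sketch of a classloader that loads user classes itself first and delegates a configurable set of system classes to its parent. It is a simplification under stated assumptions, not the ApplicationClassLoader code: the name IsolatedClassLoaderSketch is hypothetical, and system classes are matched by plain name prefixes only.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified sketch; not the ApplicationClassLoader from the patch. */
final class IsolatedClassLoaderSketch extends URLClassLoader {
    private final List<String> systemPrefixes;

    IsolatedClassLoaderSketch(URL[] urls, ClassLoader parent, List<String> systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes;
    }

    /** Assumption: a class is a system class if its name starts with a listed prefix. */
    boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Non-system classes are looked up in this loader's URLs first.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not found locally; fall back to normal parent delegation below.
                }
            }
            if (c == null) {
                // System classes, and anything not found locally, go to the parent.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> prefixes = Arrays.asList("java.", "javax.");
        try (IsolatedClassLoaderSketch loader = new IsolatedClassLoaderSketch(
                new URL[0], ClassLoader.getSystemClassLoader(), prefixes)) {
            System.out.println(loader.loadClass("java.lang.String") == String.class);
        }
    }
}
```

Because user classes are resolved before the parent is consulted, the user's jar can carry a different version of a library than the one on Hadoop's own classpath.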

It turns out to be a bit simpler than I initially expected. The situation is 
pretty similar to (but not entirely the same as) the YarnChild case.

I've also tested a real job submission of a fairly small app with several 
dependencies.

I'd love to hear feedback on the patch. Thanks!



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-07-28 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076777#comment-14076777
 ] 

Sangjin Lee commented on HADOOP-10893:
--

Incidentally, I find USER_CLASSPATH_FIRST to be somewhat broken. It applies to 
what's being provided through the CLASSPATH environment variable. However, it 
does not apply to what's being provided through the jar itself (it always comes 
last in the class search).



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077001#comment-14077001
 ] 

Hadoop QA commented on HADOOP-10893:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12658221/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4374//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4374//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4374//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077252#comment-14077252
 ] 

Hadoop QA commented on HADOOP-10893:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12658293/HADOOP-10893.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4376//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4376//console

This message is automatically generated.



[jira] [Commented] (HADOOP-10893) isolated classloader on the client side

2014-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077355#comment-14077355
 ] 

Hadoop QA commented on HADOOP-10893:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12658339/classloader-test.tar.gz
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4378//console

This message is automatically generated.
