[GitHub] keith-turner commented on a change in pull request #49: Updates for starting MapReduce jobs from randomwalk

GitBox Thu, 03 Jan 2019 06:43:01 -0800

keith-turner commented on a change in pull request #49: Updates for starting 
MapReduce jobs from randomwalk
URL: https://github.com/apache/accumulo-testing/pull/49#discussion_r245019316


 ##########
 File path: src/main/java/org/apache/accumulo/testing/TestEnv.java
 ##########
 @@ -97,6 +97,13 @@ public Configuration getHadoopConfiguration() {
      hadoopConfig.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
       hadoopConfig.set("mapreduce.framework.name", "yarn");
      hadoopConfig.set("yarn.resourcemanager.hostname", getYarnResourceManager());
+      String hadoopHome = System.getenv("HADOOP_HOME");
+      if (hadoopHome == null) {
+        throw new IllegalArgumentException("HADOOP_HOME must be set in env");
+      }
+      hadoopConfig.set("yarn.app.mapreduce.am.env", "HADOOP_MAPRED_HOME=" + hadoopHome);
 
 Review comment:
   I have searched for a long time without success, trying to understand what 
these settings are expected to be and what the Hadoop best practices are. It 
seems odd that a client submitting a job has to know where the MapReduce jars 
are on the cluster.
   
   I am curious whether the client can skip setting `HADOOP_MAPRED_HOME` when 
it is already set in `hadoop-env.sh` on the cluster. I have not been able to 
find any definitive docs on this. If I get a chance, I may try experimenting 
with this.
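
One way to try that experiment is to make the client-side setting optional rather than mandatory. The sketch below is only an illustration and does not answer the open question: `AmEnvSketch` and `amEnvSettings` are hypothetical names, and a plain `Map` stands in for Hadoop's `Configuration` so the example runs without Hadoop on the classpath. Whether the cluster-side `hadoop-env.sh` value is picked up when the client sets nothing is an unverified assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AmEnvSketch {

  // Hypothetical helper: given the HADOOP_HOME value seen by the client
  // (possibly null), return the extra settings the client would add.
  static Map<String, String> amEnvSettings(String hadoopHome) {
    Map<String, String> settings = new LinkedHashMap<>();
    if (hadoopHome != null && !hadoopHome.isEmpty()) {
      // Same property and value format as in the diff above.
      settings.put("yarn.app.mapreduce.am.env", "HADOOP_MAPRED_HOME=" + hadoopHome);
    }
    // Otherwise add nothing and rely on cluster-side configuration
    // (assumption: not confirmed by any documentation cited here).
    return settings;
  }

  public static void main(String[] args) {
    Map<String, String> settings = amEnvSettings(System.getenv("HADOOP_HOME"));
    if (settings.isEmpty()) {
      System.out.println("HADOOP_HOME not set on client; leaving yarn.app.mapreduce.am.env unset");
    } else {
      settings.forEach((key, value) -> System.out.println(key + "=" + value));
    }
  }
}
```

In `TestEnv`, the returned entries would be copied into `hadoopConfig` with `hadoopConfig.set(key, value)`; submitting a job with and without `HADOOP_HOME` in the client environment would then show whether the client-side value is actually required.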

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
