keith-turner commented on a change in pull request #49: Updates for starting
MapReduce jobs from randomwalk
URL: https://github.com/apache/accumulo-testing/pull/49#discussion_r245019316
##########
File path: src/main/java/org/apache/accumulo/testing/TestEnv.java
##########
@@ -97,6 +97,13 @@ public Configuration getHadoopConfiguration() {
     hadoopConfig.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
     hadoopConfig.set("mapreduce.framework.name", "yarn");
     hadoopConfig.set("yarn.resourcemanager.hostname", getYarnResourceManager());
+    String hadoopHome = System.getenv("HADOOP_HOME");
+    if (hadoopHome == null) {
+      throw new IllegalArgumentException("HADOOP_HOME must be set in env");
+    }
+    hadoopConfig.set("yarn.app.mapreduce.am.env", "HADOOP_MAPRED_HOME=" + hadoopHome);
Review comment:
I have searched for a long time, without success, trying to understand what
the expected values for these settings are and what the Hadoop best practices
are. It seems odd that a client submitting a job has to know where the
MapReduce jars are on the cluster.
I am curious whether the client can skip setting `HADOOP_MAPRED_HOME` when
it is already set in `hadoop-env.sh` on the cluster. I have not been able to
find any definitive docs on this. If I get a chance, I may try experimenting
with this.
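For reference while experimenting, here is a minimal sketch of the client-side
settings in question. It is not code from this pull request:
`MapredHomeSettings` and `build` are hypothetical names, and a plain `Map`
stands in for the Hadoop `Configuration` object so the sketch runs without
Hadoop on the classpath. The diff only sets `yarn.app.mapreduce.am.env`; the
`mapreduce.map.env` and `mapreduce.reduce.env` entries are added on the
assumption that the map and reduce tasks need the same variable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of accumulo-testing.
class MapredHomeSettings {

  // Returns the property names and values a client would set on its
  // Hadoop Configuration so that YARN containers see HADOOP_MAPRED_HOME.
  static Map<String,String> build(String hadoopHome) {
    if (hadoopHome == null || hadoopHome.isEmpty()) {
      throw new IllegalArgumentException("HADOOP_HOME must be set in env");
    }
    String value = "HADOOP_MAPRED_HOME=" + hadoopHome;
    Map<String,String> settings = new LinkedHashMap<>();
    // The setting from the diff: environment for the MapReduce application master.
    settings.put("yarn.app.mapreduce.am.env", value);
    // Assumption, not in the diff: the same variable for map and reduce tasks.
    settings.put("mapreduce.map.env", value);
    settings.put("mapreduce.reduce.env", value);
    return settings;
  }

  public static void main(String[] args) {
    String hadoopHome = System.getenv("HADOOP_HOME");
    if (hadoopHome == null) {
      System.out.println("HADOOP_HOME is not set; nothing to print");
      return;
    }
    // Print each entry as name=value, one per line.
    build(hadoopHome).forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```

Leaving these entries out on the client and setting `HADOOP_MAPRED_HOME` only
in `hadoop-env.sh` on the cluster would be one way to run the experiment
described above. The sketch does not predict the outcome.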