Jason Lowe commented on MAPREDUCE-7069:

Thanks for the patch!

It's a bit odd and inefficient that setVMEnv calls 
MRApps.setEnvFromInputProperty twice.  I think it would be clearer and more 
efficient to call it once, place the results in a temporary map (as it 
already does for the second call), then only set HADOOP_ROOT_LOGGER and 
HADOOP_CLIENT_OPTS in the environment if they are not present in the temporary 
map.  At the end we can simply call putAll to dump the contents of the 
temporary map into the environment map.
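To make the suggestion concrete, here is a rough sketch of the flow I have in mind.  The method and parameter names are illustrative only, not the actual MRApps/setVMEnv signatures; the point is parse-once, default-if-absent, then a single putAll:

```java
import java.util.HashMap;
import java.util.Map;

public class EnvMergeSketch {
    // Illustrative stand-in for the suggested setVMEnv flow: userEnv plays the
    // role of the map produced by a single setEnvFromInputProperty call.
    static void setVMEnv(Map<String, String> environment,
                         Map<String, String> userEnv,
                         String defaultLogger, String defaultClientOpts) {
        Map<String, String> tmpEnv = new HashMap<>(userEnv);
        // Only apply the defaults when the user has not overridden them.
        tmpEnv.putIfAbsent("HADOOP_ROOT_LOGGER", defaultLogger);
        tmpEnv.putIfAbsent("HADOOP_CLIENT_OPTS", defaultClientOpts);
        // One dump into the real environment map at the end.
        environment.putAll(tmpEnv);
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        Map<String, String> user = new HashMap<>();
        user.put("HADOOP_ROOT_LOGGER", "DEBUG,console");
        setVMEnv(env, user, "INFO,console", "-Xmx200m");
        System.out.println(env.get("HADOOP_ROOT_LOGGER")); // user override kept
        System.out.println(env.get("HADOOP_CLIENT_OPTS")); // default applied
    }
}
```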

The example documentation in JobConf is confusing.  It uses 
"MAPRED_MAP_TASK_ENV" and "MAPRED_REDUCE_TASK_ENV", but those literal strings 
should not appear in the property name.  It would be clearer if the examples 
used "mapreduce.map.env" and "mapreduce.reduce.env".  Alternatively, give the 
example in the Java realm as something like set(MAPRED_MAP_TASK_ENV 
+ ".varName", varValue) so it's clearly not a literal string in the property 
name.  My preference is the former.
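In other words, something along these lines for the docs (variable names here are made up for illustration, and the Map is just a stand-in for Configuration.set to keep the snippet self-contained):

```java
import java.util.HashMap;
import java.util.Map;

public class EnvPropertyExample {
    // Stand-in for JobConf/Configuration so the example compiles on its own.
    static final Map<String, String> conf = new HashMap<>();

    public static void main(String[] args) {
        // Preferred doc style: spell out the real property name prefix.
        conf.put("mapreduce.map.env.FOO", "bar");               // map tasks get FOO=bar
        conf.put("mapreduce.reduce.env.GREETING", "hi, there"); // commas are fine per-variable

        // Java-realm alternative, if the docs keep the constant:
        //   conf.set(MAPRED_MAP_TASK_ENV + ".FOO", "bar");
    }
}
```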

The relevant property descriptions in mapred-default.xml should be updated to 
reflect the new functionality.
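For example, the mapreduce.map.env entry could gain a sentence like the following (exact wording is just a suggestion):

```xml
<property>
  <name>mapreduce.map.env</name>
  <value></value>
  <description>User added environment variables for the map task processes,
  specified as comma separated NAME=VALUE pairs.  To define a variable whose
  value contains commas or equal signs, use the per-variable form
  mapreduce.map.env.VARNAME=value instead.</description>
</property>
```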

It would be good to update MapReduceTutorial.md to document the options for 
passing environment variables to tasks.

There are a number of comments in setEnvFromString that should be fixed up.  I 
realize this is mostly cut-n-paste from the old setEnvFromInputString, but 
since we're refactoring it would be nice to clean it up a bit in the process.  
There's no such thing as a tt (tasktracker) in YARN, and the comments imply 
this is only called to set up the env by a nodemanager for a child process.  
That's not always the case.  Also, "note" should be "not", and so on.

For javadoc comments it's not necessary to state the type of the parameter 
after the parameter name; javadoc extracts the type from the method signature 
automatically.

Nit: setEnvFromInputStringMap does not need to be public.

Would it be easier to call tmpEnv.putAll(inputMap) and pass tmpEnv instead of 
inputMap?  Then we wouldn't need to explicitly iterate over the map.

The unit test should add new properties with commas and/or equal signs in the 
value and verify the values come through intact in the environment map.
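A toy sketch of the kind of check I mean.  The parser here is a made-up stand-in, not the real setEnvFromInputProperty; the actual test would call the MRApps method, but the assertions on commas and equal signs would look the same:

```java
import java.util.HashMap;
import java.util.Map;

public class CommaEnvTest {
    // Toy stand-in for the per-variable parsing under review: a property named
    // "<prefix>.<VAR>" becomes VAR=<value>, with no splitting of the value.
    static Map<String, String> envFromProps(Map<String, String> props, String prefix) {
        Map<String, String> env = new HashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (e.getKey().startsWith(prefix + ".")) {
                env.put(e.getKey().substring(prefix.length() + 1), e.getValue());
            }
        }
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("mapreduce.map.env.CLASSPATH", "a.jar,b.jar");      // comma in value
        props.put("mapreduce.map.env.OPTS", "-Dfoo=bar,-Dbaz=qux");   // equals and commas
        Map<String, String> env = envFromProps(props, "mapreduce.map.env");
        // Values with commas and equal signs must come through intact.
        System.out.println(env.get("CLASSPATH"));
        System.out.println(env.get("OPTS"));
    }
}
```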

Does it make sense to split some of the unit test up into separate tests?  For 
example, the null-input test can easily stand by itself.  Separate tests make 
it easier to identify what's working and what's broken than a stack trace with 
a line number in the middle of one large unit test that is covering many 
different aspects.

> Add ability to specify user environment variables individually
> --------------------------------------------------------------
>                 Key: MAPREDUCE-7069
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> {{mapreduce.reduce.env}}, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}
