[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated MAPREDUCE-4374:
---------------------------------

    Attachment: MAPREDUCE-4374-branch-1-win-2.patch

{quote}
{code}
static final String DEFAULT_HOME_DIR =
      System.getenv(Shell.WINDOWS ? "USERPROFILE" : "HOME");
{code}
{quote}
I think this is an old private constant variable that is only used in the 
TaskRunner class; also, it aligns with other private constant variables in the 
class, e.g. DEFAULT_MAPRED_ADMIN_JAVA_OPTS, DEFAULT_SHELL. So I am in favor to 
keep it here than put in Shell.

{quote}
Firstly, why do we need this when non-Windows gets away without it? It just 
seems to be pure Java code. Secondly, if we know what abstract operation FOO we 
are doing here, then can we push it into Shell.FOO()?
{quote}
About triming leading and trailing quotes on Windows, this is due to a subtle 
difference of ‘set’ on cmd shell and ‘export’ on bash shell. According to the 
[set document|http://technet.microsoft.com/en-us/library/bb490998] which I 
quoted below:
??The characters <, >, |, &, ^ are special command shell characters and must be 
either preceded by the escape character (^) or enclosed in quotation marks when 
used in string (that is, "StringContaining&Symbol". If you use quotation marks 
to enclose a string containing one of the special characters, *the quotation 
marks are set as part of the environment variable value*.??
On Linux, quotation marks are not set as part of the environment variable by 
‘export’.
To launch child tasks with correct environment, we setup the environment by 
writing a series of ‘set’ or ‘export’ command (cf. TaskRunner.run() method) to 
TaskController.COMMAND_FILE, and execute the file to launch the Java job (cf. 
DefaultTaskController.launchTask() method). So when we receive the environment 
variables on Windows, we need to trim the leading and ending quotes around them 
in the program.
These are not perfect solutions. However notice this is the existing behavior 
from the previous change which may work for majority of cases. I only changed 
the code to make it more concise. Let’s not push this to Shell as I think the 
code is simply enough to be understood and we can work towards a better 
abstraction for this (handling ‘set’ vs ‘export’) in the future.

{quote}
Why this code addition? Is this a general bug you have found?
{quote}
Notice this is tmp directory. On Linux /tmp exists by default which is not the 
case on Windows.

{quote}
Should be easy to get the regex pattern for env vars from Shell?
{quote}
Moved to Shell in the new Patch.

{quote}
Should easily move inside Shell.getTempPath()?
{quote}
The folders here are really only just text string. The test is really just 
testing if the environment variable is set correctly. We make them folder names 
to make them more like real world scenario because that is a major use of 
environment variable in practice.

{quote}
This could easily use Shell.getUserHome()?
{quote}
Again, the purpose here is not to get user home, but to test an environment 
variable. I don’t think there is a need to create Shell function for this. The 
environment variables are very platform dependent. So I think we should accept 
the platform dependence here.

                
> Fix child task environment variable config and add support for Windows
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4374
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4374
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1-win
>            Reporter: Chuan Liu
>            Assignee: Chuan Liu
>            Priority: Minor
>         Attachments: MAPREDUCE-4374-branch-1-win-2.patch, 
> MAPREDUCE-4374-branch-1-win.patch
>
>
> In HADOOP-2838, a new feature was introduced to set environment variables via 
> the Hadoop config 'mapred.child.env' for child tasks. There are some further 
> fixes and improvements around this feature, e.g. HADOOP-5981 were a bug fix; 
> MAPREDUCE-478 broke the config into 'mapred.map.child.env' and 
> 'mapred.reduce.child.env'.  However the current implementation is still not 
> complete. It does not match its documentation or original intend as I 
> believe. Also, by using ‘:’ (colon) and ‘;’ (semicolon) in the configuration 
> syntax, we will have problems using them on Windows because ‘:’ appears very 
> often in Windows path as in “C:\”, and environment variables are used very 
> often to hold path names. The Jira is created to fix the problem and provide 
> support on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


Reply via email to