[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4987:
-------------------------------------

    Attachment: MAPREDUCE-4987.1.patch

I'm attaching a patch.  This fixes the issue of symlink handling on Windows by 
copying the files instead of truly symlinking, similar to the approach taken in 
prior patches like HADOOP-9061.  This also fixes the logic for bundling the 
classpath into a jar manifest by guaranteeing that localized resources get 
added to the classpath, even if those localized resources don't exist in the 
container path yet.  (The classpath jar must get created before the container 
launch script runs to symlink or copy files from filecache, so this was a 
chicken-and-egg problem.)  With these changes in place, 
{{TestMRJobs#testDistributedCache}} passes on Mac and Windows.

Here is a summary of the changes in each file:

{{FileUtil#createJarWithClassPath}} - Accept environment provided by caller, 
because YARN will construct an environment different from the current system 
environment.  Provide a way to maintain a classpath entry with a trailing '/' 
even though the directory doesn't exist, because the container launch script 
hasn't run yet.
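
To illustrate the trailing '/' point, here is a minimal Java sketch. It is not the code from the patch, and the names {{ClassPathManifestSketch}}, {{toClassPathEntry}} and {{buildManifest}} are hypothetical. It relies on two documented behaviors: {{File#toURI}} only appends a '/' when it can determine that the path is an existing directory, and a Class-Path entry is treated as a directory only if it ends with '/'.

```java
import java.io.File;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ClassPathManifestSketch {

    /**
     * Hypothetical helper: turns a path into a manifest Class-Path entry.
     * If the caller wrote a trailing separator, keep a trailing '/' in the
     * URI even when the directory does not exist on disk yet.
     */
    static String toClassPathEntry(String path) {
        boolean wantsDirectory = path.endsWith("/") || path.endsWith(File.separator);
        String uri = new File(path).toURI().toString();
        if (wantsDirectory && !uri.endsWith("/")) {
            uri = uri + "/";
        }
        return uri;
    }

    /** Builds a manifest whose Class-Path lists the given paths, space separated. */
    static Manifest buildManifest(List<String> paths) {
        StringBuilder classPath = new StringBuilder();
        for (String path : paths) {
            if (classPath.length() > 0) {
                classPath.append(' ');
            }
            classPath.append(toClassPathEntry(path));
        }
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set or the main attributes are not written out.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.CLASS_PATH, classPath.toString());
        return manifest;
    }
}
```

The resulting {{Manifest}} could then be passed to a {{JarOutputStream}} to produce the classpath jar.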

{{TestFileUtil#testCreateJarWithClassPath}} - Change test to cover new logic.

{{TestMRJobs}} - Initialize {{MiniDFSCluster}} in a @BeforeClass method instead 
of a static initialization block.  This test uses an inner class, 
{{DistributedCacheChecker}}, as the job's mapper.  Since this is an inner 
class, it has a back-reference to the {{TestMRJobs}} class.  This means that 
the {{TestMRJobs}} static initialization runs for each mapper task in addition 
to running in the JUnit runner.  Therefore, this would start multiple instances 
of {{MiniDFSCluster}} pointing at the same directories, which would sometimes 
cause deadlocks.  Moving the initialization to a @BeforeClass method prevents 
it from running in the mappers.  I also needed to add a special check that a 
path is a symlinked directory, because {{FileUtils#isSymlink}} does not work as 
expected on Windows.
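
The class-initialization effect can be shown with a simplified Java sketch that does not use JUnit or Hadoop. All names here ({{StaticInitSketch}}, {{EagerSuite}}, {{LazySuite}}, {{Checker}}) are hypothetical stand-ins, and the counters stand in for starting a {{MiniDFSCluster}}. The assumption is that the nested mapper-like class touches a static member of its enclosing class, which forces the enclosing class to initialize in whatever JVM runs the mapper.

```java
public class StaticInitSketch {
    // Counters standing in for "a cluster was started".
    static int eagerStarts = 0;
    static int lazyStarts = 0;

    /** Mirrors the old layout: expensive setup in a static initializer. */
    static class EagerSuite {
        static final String CLUSTER_DIR;
        static {
            eagerStarts++;              // runs in every JVM that initializes EagerSuite
            CLUSTER_DIR = "target/dfs";
        }

        static class Checker {
            String map() {
                // Reading a non-constant static field initializes EagerSuite.
                return "checked " + CLUSTER_DIR;
            }
        }
    }

    /** Mirrors the new layout: setup runs only when the test runner calls it. */
    static class LazySuite {
        static String clusterDir = "target/dfs";

        // In JUnit 4 this method would carry the @BeforeClass annotation.
        static void setUpCluster() {
            lazyStarts++;
        }

        static class Checker {
            String map() {
                return "checked " + clusterDir;
            }
        }
    }
}
```

Calling {{new StaticInitSketch.EagerSuite.Checker().map()}} triggers the static block as a side effect, while the {{LazySuite}} equivalent leaves the setup method uncalled.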

{{ContainerLaunch}} - Copy files instead of symlinking on Windows.  Guarantee 
that localized resources get added to the classpath correctly, even if the 
paths do not exist yet.
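
As a rough illustration of the copy-instead-of-symlink idea, here is a Java sketch. It is not how {{ContainerLaunch}} does it, since the real change works through the container launch script; the class {{LinkOrCopySketch}} and its {{linkOrCopy}} method are hypothetical, and the sketch assumes {{java.nio.file}} (Java 7 or later) is available.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Iterator;
import java.util.stream.Stream;

public class LinkOrCopySketch {

    /**
     * Hypothetical helper: makes 'link' refer to the contents of 'target'.
     * When copyInsteadOfLink is true (for example on Windows), the file or
     * directory tree is copied; otherwise a real symlink is created.
     */
    static void linkOrCopy(Path target, Path link, boolean copyInsteadOfLink)
            throws IOException {
        if (!copyInsteadOfLink) {
            Files.createSymbolicLink(link, target);
            return;
        }
        if (!Files.isDirectory(target)) {
            Files.copy(target, link, StandardCopyOption.REPLACE_EXISTING);
            return;
        }
        // Files.walk visits a directory before its children, so each
        // destination directory exists before files are copied into it.
        try (Stream<Path> walk = Files.walk(target)) {
            Iterator<Path> it = walk.iterator();
            while (it.hasNext()) {
                Path src = it.next();
                Path dst = link.resolve(target.relativize(src).toString());
                if (Files.isDirectory(src)) {
                    Files.createDirectories(dst);
                } else {
                    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

A caller might pass {{System.getProperty("os.name").startsWith("Windows")}} as the flag. A copied file reports its own length, which avoids depending on how the platform reports the length of a symlink.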

                
> TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior 
> of symlinks
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4987
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distributed-cache, nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: MAPREDUCE-4987.1.patch
>
>
> On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while 
> checking the length of a symlink.  It expects to see the length of the target 
> of the symlink, but Java 6 on Windows always reports that a symlink has 
> length 0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
