[ 
https://issues.apache.org/jira/browse/HADOOP-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Woodhead updated HADOOP-1642:
------------------------------------

    Attachment: HADOOP-1642-3.patch

Code Review please. 

The actual change to the source code is very small: a one-line modification 
on line 117 of LocalJobRunner.java, like so:

          String mapId = jobId + "_map_" + idFormat.format(i);

The problem was that in local MR and FS mode the mapId was not unique across 
jobs, so tasks from different jobs created .out files with the same name. 
Prepending the jobId as shown above fixes this. Maybe there is a better way to 
do this that is more consistent with how the distributed Job Runner does it, 
or perhaps a GUID of some sort? The above worked for me, but if you have a 
preference for another way of generating the mapId I'm keen to hear it.
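
To illustrate the collision, here is a minimal, self-contained sketch. It is 
not the LocalJobRunner code itself: the class name MapIdSketch, the helper 
methods, and the two job ids are hypothetical, and the pre-patch id format is 
an assumption inferred from the "map_0000" directory in the error path. It 
also assumes the two jobs already have distinct jobIds.

```java
import java.text.NumberFormat;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class MapIdSketch {
    // Four-digit, zero-padded counter, matching the "map_0000" shape in the error path.
    private static NumberFormat newIdFormat() {
        NumberFormat f = NumberFormat.getInstance(Locale.ROOT);
        f.setMinimumIntegerDigits(4);
        f.setGroupingUsed(false);
        return f;
    }

    // Assumed pre-patch scheme: the id depends only on the map index.
    static String oldMapId(int i) {
        return "map_" + newIdFormat().format(i);
    }

    // Patched scheme: the jobId prefix separates tasks that belong to different jobs.
    static String newMapId(String jobId, int i) {
        return jobId + "_map_" + newIdFormat().format(i);
    }

    public static void main(String[] args) {
        String[] jobIds = {"job_local_1", "job_local_2"}; // hypothetical, assumed distinct
        Set<String> oldIds = new HashSet<>();
        Set<String> newIds = new HashSet<>();
        for (String jobId : jobIds) {
            oldIds.add(oldMapId(0));        // first map task of each job
            newIds.add(newMapId(jobId, 0));
        }
        System.out.println("old scheme distinct ids: " + oldIds.size());
        System.out.println("new scheme distinct ids: " + newIds.size());
    }
}
```

With two jobs, the old scheme yields a single id for both first map tasks, 
while the prefixed scheme yields two. The prefix only helps if the jobIds 
themselves differ.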

The bulk of the patch consists of a unit test for this. I have created a 
TestLocalJobControl which extends HadoopTestCase and sets up a mini-MR cluster 
in local MR and FS mode. I based the test on TestJobControl (which is a bit of 
a weird unit test IMHO, as it makes no assertions; surely it should check the 
job results?). Anyway, I factored out all the common functionality into a 
JobControlTestUtils class and changed both tests to use it. My test failed 
with the 

java.io.IOException: Target build/test/mapred/local/map_0000/file.out

message until the change to LocalJobRunner was made, so I think it covers this 
issue.

> Jobs using LocalJobRunner + JobControl fails
> --------------------------------------------
>
>                 Key: HADOOP-1642
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1642
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Johan Oskarsson
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: HADOOP-1642-2.patch, HADOOP-1642-3.patch, 
> HADOOP-1642.patch
>
>
> If I run several jobs at the same time using JobControl and the 
> LocalJobRunner I get:
> java.io.IOException: Target 
> /tmp/hadoop-johan/mapred/local/localRunner/job_local_1.xml already exists.
> It seems like the JobControl class tries to run multiple jobs with the same 
> jobid, causing the exception.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
