[
https://issues.apache.org/jira/browse/MAPREDUCE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sandy Ryza reassigned MAPREDUCE-5001:
-------------------------------------
Assignee: Sandy Ryza
> LocalJobRunner has race condition resulting in job failures
> ------------------------------------------------------------
>
> Key: MAPREDUCE-5001
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5001
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.0.2-alpha
> Reporter: Brock Noland
> Assignee: Sandy Ryza
>
> Hive is hitting a race condition with LocalJobRunner and the Cluster class.
> The JobClient uses the Cluster class to obtain Job objects. The Cluster class
> uses the job.xml file to populate the JobConf object
> (https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java#L184).
> However, this file is deleted by the LocalJobRunner at the end of its job
> (https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java#L484).
> This results in the following exception:
> {noformat}
> 2013-02-11 14:45:17,755 (main) [FATAL - org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2001)] error parsing conf file:/tmp/hadoop-brock/mapred/staging/brock1916441210/.staging/job_local_0432/job.xml
> java.io.FileNotFoundException: /tmp/hadoop-brock/mapred/staging/brock1916441210/.staging/job_local_0432/job.xml (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:120)
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1917)
>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1870)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1777)
>     at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
>     at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1951)
>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:398)
>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:388)
>     at org.apache.hadoop.mapred.JobClient$NetworkedJob.<init>(JobClient.java:174)
>     at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:655)
>     at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:668)
>     at org.apache.hadoop.mapreduce.TestMR2LocalMode.test(TestMR2LocalMode.java:40)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
>     at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>     at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> {noformat}
> Here is code which exposes this race fairly quickly:
> {noformat}
> Configuration conf = new Configuration();
> conf.set("mapreduce.framework.name", "local");
> conf.set("mapreduce.jobtracker.address", "local");
> File inputDir = new File("/tmp", "input-" + System.currentTimeMillis());
> File outputDir = new File("/tmp", "output-" + System.currentTimeMillis());
> while (true) {
>   Assert.assertTrue(inputDir.mkdirs());
>   File inputFile = new File(inputDir, "file");
>   FileUtils.copyFile(new File("/etc/passwd"), inputFile);
>   Path input = new Path(inputDir.getAbsolutePath());
>   Path output = new Path(outputDir.getAbsolutePath());
>   JobConf jobConf = new JobConf(conf, TestMR2LocalMode.class);
>   FileInputFormat.addInputPath(jobConf, input);
>   FileOutputFormat.setOutputPath(jobConf, output);
>   JobClient jobClient = new JobClient(conf);
>   RunningJob runningJob = jobClient.submitJob(jobConf);
>   while (!runningJob.isComplete()) {
>     runningJob = jobClient.getJob(runningJob.getJobID());
>   }
>   FileUtils.deleteQuietly(inputDir);
>   FileUtils.deleteQuietly(outputDir);
> }
> {noformat}
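For readers without a Hadoop checkout at hand, the failure mode described above (a reader opening job.xml after the runner has already deleted it) can be illustrated with plain JDK file APIs. This is only a rough sketch under stated assumptions: the class JobFileRaceSketch and the method readJobFile are hypothetical names, the temporary directory merely stands in for the staging directory, and the "treat a missing file as absent" handling is one possible defensive pattern, not the fix adopted for this issue.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical illustration; none of these names exist in Hadoop.
public class JobFileRaceSketch {

    // Reads the job file, returning empty if it was deleted before we could
    // open it, instead of letting the exception propagate to the caller.
    static Optional<String> readJobFile(Path jobXml) throws IOException {
        try {
            byte[] bytes = Files.readAllBytes(jobXml);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8));
        } catch (NoSuchFileException e) {
            // The file vanished between job submission and this read.
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary directory standing in for the .staging/job_local_... directory.
        Path staging = Files.createTempDirectory("staging");
        Path jobXml = staging.resolve("job.xml");
        Files.write(jobXml, "<configuration/>".getBytes(StandardCharsets.UTF_8));

        // While the job is running, the file is present.
        System.out.println("before cleanup: " + readJobFile(jobXml).isPresent());

        // Stand-in for the runner deleting job.xml at the end of the job.
        Files.delete(jobXml);

        // A client polling after cleanup no longer finds the file.
        System.out.println("after cleanup: " + readJobFile(jobXml).isPresent());

        Files.delete(staging);
    }
}
```

In the real report the read and the delete happen on different threads, so which outcome the client sees depends on timing; the sketch runs them sequentially only to make both outcomes visible.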