[ https://issues.apache.org/jira/browse/HADOOP-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550297 ]
Devaraj Das commented on HADOOP-2391:
-------------------------------------
bq. Another, perhaps better, solution would simply be to store the temporary
task output somewhere else, NOT beneath the output directory
+1
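
For illustration, here is a minimal sketch of that idea, assuming a hypothetical scratch location outside the job output directory and a later FileSystem API (the paths, class name, and argument handling are all invented for the example): each attempt writes into its own scratch dir, only a committed attempt renames its files into the real output dir, and the scratch dir is deleted either way, so a killed speculative attempt cannot leave a _taskid directory behind.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SideScratchCommit {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    Path outputDir = new Path(args[0]);   // job output dir read by chained jobs
    String taskId  = args[1];             // e.g. _task_..._r_000014_1
    // Hypothetical layout: attempt output lives OUTSIDE outputDir,
    // so readers of outputDir never see in-flight attempt files.
    Path scratch   = new Path("/tmp/mapred-scratch", taskId);
    Path partFile  = new Path(scratch, "part-00014");

    // ... the task attempt writes partFile under scratch here ...

    boolean committed = Boolean.parseBoolean(args[2]);
    if (committed) {
      // Only the winning attempt promotes its output into the real dir.
      fs.rename(partFile, new Path(outputDir, partFile.getName()));
    }
    // Discard the scratch dir in all cases; a killed speculative attempt
    // leaves nothing under outputDir for a chained job to trip over.
    fs.delete(scratch, true);
  }
}
{code}

The key property is that the rename is the commit point: a reader of outputDir sees either a committed part file or nothing, never a half-written _taskid directory.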
> Speculative Execution race condition with output paths
> ------------------------------------------------------
>
> Key: HADOOP-2391
> URL: https://issues.apache.org/jira/browse/HADOOP-2391
> Project: Hadoop
> Issue Type: Bug
> Environment: all
> Reporter: Dennis Kubes
> Assignee: Dennis Kubes
>
> I am tracking a problem where, when speculative execution is enabled, there is
> a race condition when trying to read output paths from a previously completed
> job. More specifically, when reduce tasks run, their output is put into a
> working directory under the task name until the task is completed. The
> directory name is something like workdir/_taskid. Upon completion the output
> gets moved into workdir. Regular tasks are checked for this move and are not
> considered completed until it has happened. I have not verified it, but all
> indications point to speculative tasks NOT having this same completion check
> and, more importantly, not having their output removed when they are killed.
> So what we end up with, when reading the output of previous tasks with
> speculative execution enabled, is the possibility that a leftover
> workdir/_taskid directory will still be present when the output directory is
> read by a chained job. Here is an error which supports my theory:
> Generator: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Cannot open filename /u01/hadoop/mapred/temp/generate-temp-1197104928603/_task_200712080949_0005_r_000014_1
> at org.apache.hadoop.dfs.NameNode.open(NameNode.java:234)
> at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:389)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:644)
> at org.apache.hadoop.ipc.Client.call(Client.java:507)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:186)
> at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
> at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:839)
> at org.apache.hadoop.dfs.DFSClient$DFSInputStream.<init>(DFSClient.java:831)
> at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:263)
> at org.apache.hadoop.dfs.DistributedFileSystem.open(DistributedFileSystem.java:114)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1356)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
> at org.apache.hadoop.mapred.SequenceFileOutputFormat.getReaders(SequenceFileOutputFormat.java:87)
> at org.apache.nutch.crawl.Generator.generate(Generator.java:429)
> at org.apache.nutch.crawl.Generator.run(Generator.java:563)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
> at org.apache.nutch.crawl.Generator.main(Generator.java:526)
> I will continue to research this and will post updates as I make progress on
> tracking down this bug.
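
Until the temporary output is kept out of the output directory, a chained job can also defend itself by filtering out leftover attempt directories when it enumerates the previous job's output (it is that bare listing which SequenceFileOutputFormat.getReaders trips over in the trace above). A minimal sketch of such a filter, assuming a later FileSystem API with listStatus(Path, PathFilter); the class name is made up, and the leading-underscore convention is taken from the _taskid names above:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class ListCommittedParts {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    // Skip anything whose name starts with '_', e.g. a leftover
    // _task_200712080949_0005_r_000014_1 dir from a killed attempt.
    PathFilter committedOnly = new PathFilter() {
      public boolean accept(Path p) {
        return !p.getName().startsWith("_");
      }
    };

    // List only committed part files in the previous job's output dir.
    for (FileStatus part : fs.listStatus(new Path(args[0]), committedOnly)) {
      System.out.println(part.getPath());
    }
  }
}
{code}

That is only a band-aid, of course; the real fix is still to keep attempt output out of the output directory entirely, per the comment above.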