I won't be able to test this for a few days, depending on how 0.12.3
testing goes.
On Apr 4, 2007, at 1:23 PM, Tom White (JIRA) wrote:
[ https://issues.apache.org/jira/browse/HADOOP-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486762 ]
Tom White commented on HADOOP-1127:
-----------------------------------
Thanks Arun.
Nigel, can you confirm your tests don't hang now? Thanks.
Speculative Execution and output of Reduce tasks
------------------------------------------------
Key: HADOOP-1127
URL: https://issues.apache.org/jira/browse/HADOOP-1127
Project: Hadoop
Issue Type: Improvement
Components: mapred
Affects Versions: 0.12.0
Reporter: Arun C Murthy
Assigned To: Arun C Murthy
Fix For: 0.13.0
Attachments: HADOOP-1127_20070328_1.patch,
HADOOP-1127_20070331_2.patch, HADOOP-1127_20070402_3.patch,
HADOOP-1127_20070403_4.patch, HADOOP-1127_20070405_5.patch
We've recently seen instances where jobs run with 'speculative
execution' tend to be quite unstable and fail with
*AlreadyBeingCreatedException* noticed at the NameNode. Also,
we could potentially have hairy situations where a failed Reduce
task's output clashes with a successful task's (same tip)
output.
As it exists, speculative execution relies on the PhasedFileSystem,
which creates a temp output file; on task completion that
file is 'moved' to its final position via a call to
PhasedFileSystem.commit from ReduceTask.run(). This has led to
issues such as the above.
Proposal:
Basically the idea is to do this uniformly for all Reduce tasks,
i.e. all reducers create temp files and then have a serialized
'commit' done by the JobTracker, which moves the temp file to its
final position.
We create the temp file in the job's output directory itself:
<output_dir>/_<taskid> (emphasis on the leading '_')
On task completion we'll add that temp file's path to the
TaskStatus, and the JobTracker then moves that file to its final
position.
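
As a rough illustration of the write-then-commit idea above (not the
actual patch), here is a minimal Python sketch on a local filesystem.
The helper names, the attempt ids, and the final file name
'part-00000' are assumptions made for the example; the only detail
taken from the proposal is the <output_dir>/_<taskid> temp path and
the single serialized commit step.

```python
import os
import tempfile


def temp_output_path(output_dir, task_id):
    # Hypothetical helper: the leading '_' marks the file as uncommitted.
    return os.path.join(output_dir, "_" + task_id)


def write_task_output(output_dir, task_id, data):
    """Reducer side: write to the temp file and return its path."""
    path = temp_output_path(output_dir, task_id)
    with open(path, "w") as f:
        f.write(data)
    return path


def commit(temp_path, final_path):
    """Committer side (the JobTracker in the proposal).

    Moves the temp file to its final position. If another attempt of
    the same tip has already been committed, the temp file is discarded
    instead. Assumes commits are serialized by a single caller.
    """
    if os.path.exists(final_path):
        os.remove(temp_path)
        return False
    os.rename(temp_path, final_path)
    return True


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    final = os.path.join(out_dir, "part-00000")  # assumed final name
    # Two speculative attempts of the same tip (ids are made up).
    first = write_task_output(out_dir, "attempt_0", "A")
    second = write_task_output(out_dir, "attempt_1", "B")
    print(commit(first, final))
    print(commit(second, final))
    print(os.listdir(out_dir))
```

Because each attempt writes only to its own temp path, two attempts
never create the same file, and the final name is claimed only in the
one serialized commit step.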
Thoughts?