[ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422572#comment-13422572 ]
Todd Lipcon commented on MAPREDUCE-2374:
----------------------------------------
You've got the layering mixed up there -- execve is perfectly happy to read out
of the buffer cache regardless of whether the underlying file has reached
persistent media or not. So, you don't need to sync anything.
The Java close() API does forward the 'close' through to the underlying
writers, and that also triggers a flush of any buffered streams.
So I think Andy's patch is sufficient to fix this issue.
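For anyone who wants to convince themselves of the close() behavior, here is a
minimal sketch, not taken from the patch. The class name, method name, and temp
file name are made up for illustration. It writes through a BufferedWriter
chain, closes only the outermost writer, never calls sync, and reads the file
back through a fresh descriptor.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseFlushDemo {
    /** Writes text through a buffered writer chain and returns what a later reader sees. */
    static String writeAndReadBack(String text) throws IOException {
        Path script = Files.createTempFile("foo", ".sh"); // hypothetical file name
        try {
            try (Writer out = new BufferedWriter(
                    new OutputStreamWriter(Files.newOutputStream(script), StandardCharsets.UTF_8))) {
                // The text may still sit in the BufferedWriter's in-memory buffer here.
                out.write(text);
            } // close() on the outer writer flushes it and closes the wrapped streams

            // No fsync was issued; the read is served by the OS regardless.
            return new String(Files.readAllBytes(script), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(script);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(writeAndReadBack("echo hello\n"));
    }
}
```

The point is only that closing the outer writer is enough for other readers
(including execve) to see the full contents; durability on disk is a separate
question that does not matter here.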
Andy: would you mind updating the patch with a comment above your change
indicating that it's important to run "bash foo.sh" instead of just doing
chmod 755 and exec'ing "foo.sh", or running "bash -c foo.sh", due to this race
condition? It's subtle, and I can imagine it regressing if someone later
reworks this code.
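To make the distinction concrete, here is a rough sketch of the "bash foo.sh"
form. It is not the actual patch, and the class and method names are
hypothetical. execve(2) fails with ETXTBSY ("Text file busy") when the file
being executed is open for writing in any process. With "bash foo.sh", the file
handed to execve is bash itself, and the script is only opened for reading, so
a lingering write descriptor on the script cannot trigger that error. With
"bash -c foo.sh", bash treats foo.sh as a command to look up and exec, which is
the direct-exec case again.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RunScriptViaBash {
    /** Writes a script to a temp file, runs it as "bash <path>", and returns the exit code. */
    static int writeAndRun(String body) throws IOException, InterruptedException {
        Path script = Files.createTempFile("foo", ".sh"); // hypothetical file name
        try {
            // Files.write opens, writes, and closes the file before returning.
            Files.write(script, body.getBytes(StandardCharsets.UTF_8));

            // The executable here is bash; the script is just an argument that
            // bash opens read-only. No chmod is needed, and the script itself
            // is never passed to execve.
            Process p = new ProcessBuilder("bash", script.toString())
                    .inheritIO()
                    .start();
            return p.waitFor();
        } finally {
            Files.deleteIfExists(script);
        }
    }

    public static void main(String[] args) throws Exception {
        int rc = writeAndRun("echo running via bash\nexit 0\n");
        System.out.println("exit code: " + rc);
    }
}
```

This assumes bash is on the PATH. A comment along the lines of the one in the
sketch is the kind of note that should sit above the real change.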
Also, please double-check that the LinuxTaskExecutor code path doesn't have the
same problem, if you can. We'll also need to look at the equivalent code in
trunk (MR2), since it probably has the same issue.
Nice work, everyone, tracking down this long-standing bug.
> Should not use PrintWriter to write taskjvm.sh
> ----------------------------------------------
>
> Key: MAPREDUCE-2374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Fix For: 0.22.1
>
> Attachments: failed_taskjvmsh.strace, mapreduce-2374-branch-1.patch,
> mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt,
> successfull_taskjvmsh.strace
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that
> class swallows all IO exceptions. We're not currently checking for errors,
> which I'm seeing result in occasional task failures with the message "Text
> file busy" - presumably because the close() call is failing silently for some
> reason.
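As an illustration of the PrintWriter behavior described in the issue summary,
here is a small self-contained sketch. FailingWriter and the method names are
invented for the demo and are not part of Hadoop. A PrintWriter wrapped around
a writer that always fails never throws; the only trace of the failure is the
checkError() flag. The same failure surfaces as an IOException when the writer
is used directly.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;

public class PrintWriterSwallowDemo {
    /** A Writer whose every operation fails, standing in for a broken disk or full volume. */
    static final class FailingWriter extends Writer {
        @Override
        public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }

        @Override
        public void flush() throws IOException {
            throw new IOException("simulated flush failure");
        }

        @Override
        public void close() throws IOException {
            throw new IOException("simulated close failure");
        }
    }

    /** Returns true if the failure was recorded only in PrintWriter's error flag. */
    static boolean printWriterHidesFailure() {
        PrintWriter pw = new PrintWriter(new FailingWriter());
        pw.println("echo hello"); // fails underneath, but nothing is thrown
        pw.close();               // same: the IOException is caught internally
        return pw.checkError();   // the only way to learn that something went wrong
    }

    /** Returns true if writing to the same kind of writer directly raises IOException. */
    static boolean plainWriterThrows() {
        try (Writer w = new FailingWriter()) {
            w.write("echo hello\n");
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("PrintWriter hid the failure: " + printWriterHidesFailure());
        System.out.println("Plain Writer threw: " + plainWriterThrows());
    }
}
```

Code that keeps PrintWriter would need to call checkError() after every write
and close; using a plain Writer makes the failure impossible to miss.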