[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421198#comment-13421198
 ] 

Colin Patrick McCabe commented on MAPREDUCE-2374:
-------------------------------------------------

I just thought of something.  Suppose that the JVM is holding blahblahblah.sh 
open for write, and meanwhile another thread forks a bash process (or 
something).  After the fork completes, the child process will also hold 
blahblahblah.sh open for write with O_WRONLY.  At the very least, this is a 
race condition that could lead to "mysterious" failures, since you don't know 
when the forked process will next get scheduled relative to the parent process.

The O_CLOEXEC flag was introduced in Linux 2.6.23 to solve this problem, by 
atomically setting the close-on-exec flag when the file is opened, so that the 
FD is closed automatically when the forked child calls exec.  However, I didn't 
see it being used in the strace output you posted.  And it's certainly not 
available on RHEL5 and earlier.

If this is true, then I guess the solution Andy posted earlier is probably the 
best way to go.  Just get rid of the -c and this behavior will be masked.
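
To illustrate the descriptor inheritance described above, here is a minimal
sketch in Python (not the Hadoop code, and not taken from the posted strace).
The helper names child_sees_fd and demo are hypothetical. Python's os.open
already sets close-on-exec by default (PEP 446), so the sketch clears that flag
with os.set_inheritable to mimic a plain open() without O_CLOEXEC, and then
sets it again to mimic O_CLOEXEC.

```python
import os
import subprocess
import sys
import tempfile

# Child program: exits 0 only if the FD number in argv[1] is open in the child
# and refers to the same file (device and inode) as the parent's descriptor.
CHILD = """
import os, sys
fd, dev, ino = (int(a) for a in sys.argv[1:4])
try:
    st = os.fstat(fd)
except OSError:
    sys.exit(1)
sys.exit(0 if (st.st_dev, st.st_ino) == (dev, ino) else 1)
"""


def child_sees_fd(fd):
    """Fork and exec a child; report whether it still holds `fd` open."""
    st = os.fstat(fd)
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(fd), str(st.st_dev), str(st.st_ino)],
        close_fds=False,  # let only the close-on-exec flag decide what survives
    )
    return result.returncode == 0


def demo():
    """Return (leaked, protected) for a script file held open for write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "blahblahblah.sh")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o700)
        try:
            os.set_inheritable(fd, True)    # like open() without O_CLOEXEC
            leaked = child_sees_fd(fd)
            os.set_inheritable(fd, False)   # like open() with O_CLOEXEC
            protected = child_sees_fd(fd)
        finally:
            os.close(fd)
    return leaked, protected
```

On a POSIX system, demo() should report that the child holds the write
descriptor in the first case and not in the second. A child that still holds a
script open for write in this way is the kind of state that could plausibly
produce "Text file busy" when the script is executed.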
                
> Should not use PrintWriter to write taskjvm.sh
> ----------------------------------------------
>
>                 Key: MAPREDUCE-2374
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.1
>
>         Attachments: failed_taskjvmsh.strace, mapreduce-2374-on-20sec.txt, 
> mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - presumably because the close() call is failing silently for some 
> reason.


        
