[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421730#comment-13421730
 ] 

Andy Isaacson commented on MAPREDUCE-2374:
------------------------------------------

It's true that open(O_CLOEXEC) isn't available in RHEL5, but using fcntl() to 
set FD_CLOEXEC is.  Setting the flag after open() doesn't completely close the 
race, but it does shrink the window substantially.  (The unfixable race window 
between open() and fcntl() is why O_CLOEXEC was added.)  It would still 
require NativeIO.  It's orthogonal to my proposed fix, though.

Colin's theory also clears up this worry:
bq. Now, suppose the undiscovered but hypothesized race condition in 
writeCommand does exist, and affects the write as well as the close.

Since there's no race condition in writeCommand, there's no chance the script 
will be incomplete when we run it.

I'd love to see a strace -tttf showing the entire failure scenario just to be 
sure, but I'm confident enough that Colin nailed it that I think we should just 
merge my "no more -c" patch.
                
> Should not use PrintWriter to write taskjvm.sh
> ----------------------------------------------
>
>                 Key: MAPREDUCE-2374
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.1
>
>         Attachments: failed_taskjvmsh.strace, mapreduce-2374-on-20sec.txt, 
> mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - presumably because the close() call is failing silently for some 
> reason.
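
The behavior the description complains about, PrintWriter hiding IOExceptions, 
can be reproduced in isolation. The following is a minimal sketch, not the 
actual TaskController code: the class SwallowedErrorDemo, its methods, and the 
always-failing FailingStream are hypothetical names used only for illustration. 
It shows that a write and close through PrintWriter raise nothing, that 
checkError() is the only way to notice the failure, and that writing to the 
stream directly surfaces the same failure as an IOException.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class SwallowedErrorDemo {

    /** Hypothetical stream that fails every write, standing in for a bad file. */
    static class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    /**
     * Writes the script through a PrintWriter. No exception can escape,
     * so the return value of checkError() is the only failure signal.
     */
    static boolean writeWithPrintWriter(OutputStream out, String script) {
        PrintWriter pw = new PrintWriter(out);
        pw.print(script);   // buffered; nothing is thrown here
        pw.close();         // the failing flush is caught inside PrintWriter
        return pw.checkError();
    }

    /** Writes the script directly, so a failure propagates to the caller. */
    static void writeDirectly(OutputStream out, String script) throws IOException {
        try {
            out.write(script.getBytes(StandardCharsets.UTF_8));
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) {
        String script = "#!/bin/sh\necho hello\n";

        boolean hidden = writeWithPrintWriter(new FailingStream(), script);
        System.out.println("PrintWriter recorded a hidden error: " + hidden);

        try {
            writeDirectly(new FailingStream(), script);
        } catch (IOException e) {
            System.out.println("Direct write reported: " + e.getMessage());
        }
    }
}
```

A caller that never consults checkError() sees the PrintWriter path as a 
success, which is the kind of silent failure the description worries about.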
