[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422518#comment-13422518
 ] 

Shrinivas Joshi commented on MAPREDUCE-2374:
--------------------------------------------

No luck with strace output yet. I will try to collect a failing case using 
short running jobs such as piEstimator. 

While I was initially debugging this issue, I could not see any process 
holding taskjvm.sh open in the lsof output, so I wondered whether ext4 delayed 
allocation had something to do with the "Text file busy" error. I thought that 
this would be more of an issue with the -c bash switch change if the whole file 
does not get committed to disk by the time the executor thread is scheduled. 
This theory could be wrong, though, as indicated by Todd and Andy. 

I had to remove the BufferedOutputStream wrapper so that the fdos.sync call 
would succeed, as LocalFSFileOutputStream implements the Syncable interface 
whereas BufferedOutputStream does not. I thought closing the PrintWriter 
wouldn't affect the flush and sync of LocalFSFileOutputStream. Maybe I am 
missing something?
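
To illustrate the write-then-sync idea, here is a minimal sketch using only 
java.io rather than the Hadoop classes. The class name ScriptWriter and the 
method writeScript are hypothetical, and FileDescriptor.sync() stands in for 
the Syncable call discussed above. It is not the code from the patch.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hadoop: writes a script without a
// buffering wrapper so that the file descriptor can be synced directly.
public class ScriptWriter {

    // Writes contents to path, forces the data to the storage device, and
    // closes the stream. Any IOException propagates to the caller.
    public static void writeScript(Path path, String contents) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(path.toFile())) {
            fos.write(contents.getBytes(StandardCharsets.UTF_8));
            fos.flush();
            // Ask the OS to commit the written bytes before the file is used.
            fos.getFD().sync();
        } // close() runs here; a failure in close() is also thrown
    }

    public static void main(String[] args) throws IOException {
        Path script = Files.createTempFile("taskjvm", ".sh");
        writeScript(script, "#!/bin/bash\necho hello\n");
        System.out.println("wrote " + Files.size(script) + " bytes to " + script);
        Files.delete(script);
    }
}
```

Because the stream is closed before writeScript returns, nothing in this 
process still holds the file open for writing when another thread later 
tries to execute it.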
                
> Should not use PrintWriter to write taskjvm.sh
> ----------------------------------------------
>
>                 Key: MAPREDUCE-2374
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.1
>
>         Attachments: failed_taskjvmsh.strace, mapreduce-2374-branch-1.patch, 
> mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt, 
> successfull_taskjvmsh.strace
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - assumedly because the close() call is failing silently for some 
> reason.
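
The behaviour described in the issue can be reproduced in isolation. The 
sketch below is an illustration only: the class PrintWriterSwallow and the 
FailingStream helper are hypothetical and simulate a failing device. It shows 
that PrintWriter reports a write failure only through checkError(), never 
through an exception.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

// Hypothetical demonstration, not Hadoop code.
public class PrintWriterSwallow {

    // An output stream whose every write fails, standing in for a disk error.
    static class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    // Writes one line through a PrintWriter over the failing stream.
    // No exception escapes; the failure is visible only via checkError().
    public static boolean writeAndCheck() {
        PrintWriter pw = new PrintWriter(new FailingStream());
        pw.println("echo hello"); // buffered, so nothing fails yet
        pw.close();               // flush fails here, but silently
        return pw.checkError();
    }

    public static void main(String[] args) {
        System.out.println("error flag set: " + writeAndCheck());
    }
}
```

A caller that never consults checkError() has no way to tell that the script 
was not fully written.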
