[ https://issues.apache.org/jira/browse/HADOOP-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515217 ]
Arun C Murthy commented on HADOOP-1612:
---------------------------------------

Christian, I've spent a fair amount of time trying to reproduce the _lost files_ case without much headway... although the issue with the _${taskid} subdirs turning up is fairly easy to reproduce and, as we discussed, it is an unfortunate side-effect of speculative tasks killed *after* job completion.

I'll keep trying to get a toehold on this; meanwhile:
a) Could you just ignore the _${taskid} files while you are moving stuff from your job output dir?
b) Try to incorporate HADOOP-1576, which, at the very least, helps in debugging.

Thanks!

> listing of an output directory shortly after job completion fails
> -----------------------------------------------------------------
>
>                 Key: HADOOP-1612
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1612
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Christian Kunz
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.14.0
>
>
> Sometimes, after a job finishes, and another application wants to rename dfs
> files created by that job, listing of the output directory containing the
> newly created files fails. File creation and directory listing is done via
> libhdfs, but it is unlikely that this makes any difference, therefore, I add
> this to the mapred component.
> It might be a race condition: does the job complete before the files in the
> output directory are promoted?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
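
For reference, workaround (a) from the comment above could look roughly like the sketch below. It is only an illustration: it runs against a local directory using the Python standard library rather than dfs/libhdfs, the helper names (is_task_temp, move_job_output) are made up for this example, and it assumes the leftover per-task entries are the only ones in the output dir whose names start with "_".

```python
import os
import shutil


def is_task_temp(name):
    # Assumption: leftover _${taskid} entries are the only names starting with "_".
    return name.startswith("_")


def move_job_output(src_dir, dst_dir):
    """Move entries from src_dir to dst_dir, skipping "_"-prefixed ones.

    Returns the sorted list of names that were moved.
    """
    os.makedirs(dst_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        if is_task_temp(name):
            # Leave task leftovers alone; they are not part of the final output.
            continue
        shutil.move(os.path.join(src_dir, name), os.path.join(dst_dir, name))
        moved.append(name)
    return moved
```

A caller would run move_job_output(job_output_dir, final_dir) once the job has reported completion. Anything starting with "_" stays behind in the job output dir, so it can still be inspected while debugging.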