[ 
https://issues.apache.org/jira/browse/HADOOP-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515217
 ] 

Arun C Murthy commented on HADOOP-1612:
---------------------------------------

Christian, I've spent a fair amount of time trying to reproduce the _lost 
files_ case without much headway. The issue with the _${taskid} subdirs 
turning up, on the other hand, is fairly easy to reproduce; as we discussed, 
it is an unfortunate side-effect of speculative tasks being killed *after* 
job completion. 

I'll keep trying to get a toehold on this. Meanwhile:
a) Could you just ignore the _${taskid} files while you are moving stuff 
from your job output dir?
b) Try to incorporate HADOOP-1576, which at the very least helps in debugging.
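
For (a), a minimal sketch of the filtering step, not tied to libhdfs or any 
Hadoop API. It assumes the leftover entries can be recognised purely by a 
leading underscore in their last path component; the helper names and the 
sample paths below are made up for illustration.

```python
import posixpath


def is_task_temp(path):
    """Return True if the last path component looks like a _${taskid} leftover.

    Assumption: leftovers are identified only by a leading underscore.
    """
    name = posixpath.basename(path.rstrip("/"))
    return name.startswith("_")


def entries_to_move(listing):
    """Keep only the entries that should be renamed out of the output dir."""
    return [p for p in listing if not is_task_temp(p)]


# Hypothetical listing of a job output dir (paths invented for illustration).
listing = [
    "/user/example/out/part-00000",
    "/user/example/out/part-00001",
    "/user/example/out/_task_0001_r_000001_1",
]

for path in entries_to_move(listing):
    print(path)
```

The same check could be applied to whatever the directory listing call 
returns before each rename, so a stray _${taskid} entry is skipped rather 
than moved.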

Thanks!


> listing of an output directory shortly after job completion fails
> -----------------------------------------------------------------
>
>                 Key: HADOOP-1612
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1612
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Christian Kunz
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.14.0
>
>
> Sometimes, after a job finishes and another application wants to rename dfs 
> files created by that job, listing the output directory that contains the 
> newly created files fails. File creation and directory listing are done via 
> libhdfs, but that is unlikely to make any difference, so I am filing this 
> under the mapred component.
> It might be a race condition: does the job complete before the files in the 
> output directory are promoted?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
