[
https://issues.apache.org/jira/browse/MAPREDUCE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13680022#comment-13680022
]
Ravi Prakash commented on MAPREDUCE-5317:
-----------------------------------------
It's quite trivial to reproduce this:
$ hadoop jar \
    $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    randomtextwriter -Dmapreduce.randomwriter.totalbytes=1000 \
    -Dmapreduce.randomwriter.bytespermap=1000 /someDirectory/run1
$ hdfs dfs -ls -R /someDirectory
drwxr-xr-x - someUser supergroup 0 2013-06-10 16:46 /someDirectory
drwxr-xr-x - someUser supergroup 0 2013-06-10 16:47 /someDirectory/run1
drwxr-xr-x - someUser supergroup 0 2013-06-10 16:47 /someDirectory/run1/_temporary
drwxr-xr-x - someUser supergroup 0 2013-06-10 16:47 /someDirectory/run1/_temporary/1
drwxr-xr-x - someUser supergroup 0 2013-06-10 16:47 /someDirectory/run1/_temporary/1/_temporary
In the namenode logs I see:
2013-06-10 16:47:17,392 [IPC Server handler 2 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* Namenode.delete: src=/someDirectory/run1/_temporary, recursive=true
2013-06-10 16:47:17,392 [IPC Server handler 2 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.delete: /someDirectory/run1/_temporary
2013-06-10 16:47:17,393 [IPC Server handler 2 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.delete: /someDirectory/run1/_temporary
2013-06-10 16:47:17,393 [IPC Server handler 2 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.unprotectedDelete: /someDirectory/run1/_temporary is removed
2013-06-10 16:47:17,393 [IPC Server handler 2 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* Namesystem.delete: /someDirectory/run1/_temporary is removed
....
.....
2013-06-10 16:47:20,709 [IPC Server handler 5 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.create: file /someDirectory/run1/_temporary/1/_temporary/attempt_1370900756164_0001_m_000005_2/part-m-00005 for DFSClient_attempt_1370900756164_0001_m_000005_2_-2017431827_1 at <SOMEIP>
2013-06-10 16:47:20,709 [IPC Server handler 5 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: src=/someDirectory/run1/_temporary/1/_temporary/attempt_1370900756164_0001_m_000005_2/part-m-00005, holder=DFSClient_attempt_1370900756164_0001_m_000005_2_-2017431827_1, clientMachine=<SOMEIP>, createParent=true, replication=1, createFlag=[CREATE, OVERWRITE]
2013-06-10 16:47:20,710 [IPC Server handler 5 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.mkdirs: created directory /someDirectory/run1/_temporary
2013-06-10 16:47:20,710 [IPC Server handler 5 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.mkdirs: created directory /someDirectory/run1/_temporary/1
2013-06-10 16:47:20,710 [IPC Server handler 5 on 9000] DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.mkdirs: created directory /someDirectory/run1/_temporary/1/_temporary
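The ordering in the namenode logs (recursive delete of _temporary at 16:47:17, then a create with createParent=true at 16:47:20) can be mimicked on a local filesystem. The following is only a rough sketch, not Hadoop code: the function names are made up for illustration, os.makedirs(..., exist_ok=True) stands in for createParent=true, and the final removal of the attempt directory is an assumption made so the result matches the listing above (the logs shown here do not say what removed it).

```python
import os
import shutil
import tempfile


def cleanup_job(output_dir):
    """Hypothetical stand-in for job cleanup: recursively delete _temporary."""
    shutil.rmtree(os.path.join(output_dir, "_temporary"), ignore_errors=True)


def late_task_write(output_dir, attempt_id, part_name):
    """Hypothetical stand-in for a task attempt that writes after cleanup.

    os.makedirs(..., exist_ok=True) plays the role of createParent=true:
    every missing parent directory is silently recreated.
    """
    attempt_dir = os.path.join(
        output_dir, "_temporary", "1", "_temporary", attempt_id
    )
    os.makedirs(attempt_dir, exist_ok=True)
    path = os.path.join(attempt_dir, part_name)
    with open(path, "w") as f:
        f.write("data")
    return path


def leftover_dirs(output_dir):
    """Return all directories below output_dir as sorted relative paths."""
    found = []
    for root, dirs, _files in os.walk(output_dir):
        for d in dirs:
            found.append(os.path.relpath(os.path.join(root, d), output_dir))
    return sorted(found)


def simulate():
    """Replay the delete-then-create ordering seen in the logs."""
    base = tempfile.mkdtemp()
    try:
        run1 = os.path.join(base, "run1")
        os.makedirs(os.path.join(run1, "_temporary", "1", "_temporary"))

        cleanup_job(run1)                      # 16:47:17 in the logs
        after_cleanup = leftover_dirs(run1)

        part = late_task_write(                # 16:47:20 in the logs
            run1, "attempt_0001_m_000005_2", "part-m-00005"
        )
        # Assumption: the late attempt later removes only its own directory.
        shutil.rmtree(os.path.dirname(part))

        return after_cleanup, leftover_dirs(run1)
    finally:
        shutil.rmtree(base, ignore_errors=True)


if __name__ == "__main__":
    after_cleanup, after_late_write = simulate()
    print("after cleanup:   ", after_cleanup)
    print("after late write:", after_late_write)
```

In this sketch the output directory is empty right after cleanup, and the late write brings back the _temporary, _temporary/1 and _temporary/1/_temporary directories, which is the same shape as the hdfs dfs -ls -R listing.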
> Stale files left behind for failed jobs
> ---------------------------------------
>
> Key: MAPREDUCE-5317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5317
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.8
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
>
> Courtesy [~amar_kamat]!
> {quote}
> We are seeing _temporary files left behind in the output folder if the job
> fails.
> The job failed because it hit a quota issue.
> I simply ran the randomwriter (from hadoop examples) with the default setting.
> That failed and left behind some stray files.
> {quote}