Given that it's an MR output, my guess is it got moved out of the temporary
directory when the job "Committed" and then was removed as another pass. I'd
grep for the containing directory name in the audit logs to see where it got
moved to and how it was eventually deleted.

Would be great if someone wrote some tools that, given a block ID, tracked
the life of the file that contained it (including renames of containing
dirs, etc). Shouldn't be too difficult.

-Todd

On Thu, Nov 11, 2010 at 9:38 AM, David Rosenstrauch <dar...@darose.net>wrote:

> On 11/11/2010 12:31 PM, David Rosenstrauch wrote:
>
>> 2) Name node also says that it created the file:
>>
>> [r...@hdmaster hadoop-0.20]# grep 2010.11.10-21.05.29
>> hadoop-hadoop-namenode-hdmaster.log.2010-11-10 | grep
>> shard2/IntentTrait.state
>> 2010-11-10 21:42:28,442 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=root,root,bin,daemon,sys,adm,disk,wheel ip=/<our data node>
>> cmd=create src=<our root
>>
>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state
>> dst=null perm=root:supergroup:rw-r--r--
>> 2010-11-10 21:42:29,802 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>> NameSystem.allocateBlock: <our root
>>
>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state.
>> blk_-4237880568969698703_13404582
>> 2010-11-10 21:42:30,360 INFO org.apache.hadoop.hdfs.StateChange:
>> Removing lease on file <our root
>>
>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state
>> from client DFSClient_attempt_201010221550_0418_r_000001_0
>> 2010-11-10 21:42:30,360 INFO org.apache.hadoop.hdfs.StateChange: DIR*
>> NameSystem.completeFile: file <our root
>>
>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state
>> is closed by DFSClient_attempt_201010221550_0418_r_000001_0
>>
>> 2a) Name node does *not* say in the logs anywhere that it deleted the
>> file. (i.e., the above 4 log lines are the only lines in the name node
>> log containing this file name)
>>
>
> Sorry, I stand corrected.  When I grep the name node logs for the block ID
> there are some additional lines:
>
> [r...@hdmaster hadoop-0.20]# grep 4237880568969698703_13404582
> hadoop-hadoop-namenode-hdmaster.log.2010-11-10
>
> 2010-11-10 21:42:29,802 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.allocateBlock: <our root
> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state.
> blk_-4237880568969698703_13404582
>  2010-11-10 21:42:30,355 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: <our ip prefix>.169:50010 is
> added to blk_-4237880568969698703_13404582 size 326522
> 2010-11-10 21:42:30,356 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: <our ip prefix>.173:50010 is
> added to blk_-4237880568969698703_13404582 size 326522
> 2010-11-10 21:42:30,357 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: <our ip prefix>.176:50010 is
> added to blk_-4237880568969698703_13404582 size 326522
> 2010-11-10 21:42:33,585 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
> <our ip prefix>.176:50010 to delete  blk_-4237880568969698703_13404582
> 2010-11-10 21:42:42,598 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
> <our ip prefix>.173:50010 to delete  blk_719995703651674050_13393533
> blk_2844104197159113873_13393516 blk_6711472063024831893_13393527
> blk_240592081553699046_13393054 blk_6933038421638165347_13393232
> blk_-1288897671244436593_13393351 blk_1408176921409258101_13393152
> blk_-4237880568969698703_13404582 blk_-4188858742780299895_13393410
> blk_-2670318277937301225_13393450 blk_-1858397614006984730_13393480
> blk_6292330466651227068_13393505
> 2010-11-10 21:42:42,598 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
> <our ip prefix>.169:50010 to delete  blk_719995703651674050_13393533
> blk_6711472063024831893_13393527 blk_-514463419649319075_13404582
> blk_-1980639884885154260_13404582 blk_240592081553699046_13393054
> blk_8520588884031249451_13393428 blk_-6620589068796293265_13393089
> blk_6933038421638165347_13393232 blk_-1288897671244436593_13393351
> blk_-3581082286710707012_13393556 blk_1408176921409258101_13393152
> blk_-1858397614006984730_13393480 blk_6292330466651227068_13393505
> blk_2844104197159113873_13393516 blk_-4237880568969698703_13404582
> blk_-4188858742780299895_13393410 blk_-2670318277937301225_13393450
>
>
> I'm still not clear why those deletes are happening though.  Will read
> through the logs again at those locations.  But if anyone's got any ideas,
> pls chime in.
>
> DR
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Reply via email to