The "audit log" is a set of entries in the normal NN log by default. It can be separated out into another log file if preferred, though it's not by default.
Try grepping for "cmd=" in your NN log, you'll see what we're talking about. -Todd On Thu, Nov 18, 2010 at 7:24 AM, David Rosenstrauch <dar...@darose.net>wrote: > Sorry, just first had a moment to get back to this now. > > ??? Never heard of the "audit log" before, and I don't see any file named > like that in the hadoop logs directory on the name node. Can you elaborate > what the audit log is and/or where it might be found? > > Thanks, > > DR > > > On 11/11/2010 07:14 PM, Todd Lipcon wrote: > >> What's the last audit log entry prior to 2010-11-10 21:42:33,389? >> >> -Todd >> >> On Thu, Nov 11, 2010 at 2:10 PM, David Rosenstrauch<dar...@darose.net >> >wrote: >> >> Saw a couple more references to that block before the "to delete blk" >>> messages: >>> >>> 2010-11-10 21:42:33,389 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>> NameSystem.addToInvalidates: blk_-4237880568969698703 is added to >>> invalidSet >>> of<our ip prefix>.169:50010 >>> 2010-11-10 21:42:33,389 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>> NameSystem.addToInvalidates: blk_-4237880568969698703 is added to >>> invalidSet >>> of<our ip prefix>.173:50010 >>> 2010-11-10 21:42:33,389 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>> NameSystem.addToInvalidates: blk_-4237880568969698703 is added to >>> invalidSet >>> of<our ip prefix>.176:50010 >>> >>> Again, I'm not sure why this is happening though. >>> >>> >>> BTW, I appreciate your comments below (about it getting moved out of the >>> temp directory and then getting removed in another pass). But I grepped >>> the >>> logs as you suggested, and I still don't see how it got moved/deleted: >>> >>> [r...@hdmaster hadoop-0.20]# grep _attempt_201010221550_0418_r_000001_0 >>> hadoop-hadoop-namenode-hdmaster.log.2010-11-10 >>> 2010-11-10 21:42:28,442 INFO >>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: >>> ugi=root,root,bin,daemon,sys,adm,disk,wheel ip=/<our ip root>.176 >>> cmd=create src=<our root >>> >>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state >>> dst=null perm=root:supergroup:rw-r--r-- >>> >>> 2010-11-10 21:42:29,802 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>> NameSystem.allocateBlock:<our root >>> >>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state. >>> blk_-4237880568969698703_13404582 >>> 2010-11-10 21:42:30,360 INFO org.apache.hadoop.hdfs.StateChange: Removing >>> lease on file<our root >>> >>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state >>> from client DFSClient_attempt_201010221550_0418_r_000001_0 >>> 2010-11-10 21:42:30,360 INFO org.apache.hadoop.hdfs.StateChange: DIR* >>> NameSystem.completeFile: file<our root >>> >>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state >>> is closed by DFSClient_attempt_201010221550_0418_r_000001_0 >>> >>> >>> I have to say, I'm really quite stumped. Been looking at this all >>> afternoon, and I still have no idea how/why that file got purged. :-( >>> >>> Thanks, >>> >>> DR >>> >>> >>> On 11/11/2010 02:13 PM, Todd Lipcon wrote: >>> >>> Given that it's an MR output, my guess is it got moved out of the >>>> temporary >>>> directory when the job "Committed" and then was removed as another pass. >>>> I'd >>>> grep for the containing directory name in the audit logs to see where it >>>> got >>>> moved to and how it was eventually deleted. >>>> >>>> Would be great if someone wrote some tools that, given a block ID, >>>> tracked >>>> the life of the file that contained it (including renames of containing >>>> dirs, etc). Shouldn't be too difficult. >>>> >>>> -Todd >>>> >>>> On Thu, Nov 11, 2010 at 9:38 AM, David Rosenstrauch<dar...@darose.net >>>> >>>>> wrote: >>>>> >>>> >>>> Sorry, I stand corrected. When I grep the name node logs for the block >>>>> ID >>>>> >>>>> there are some additional lines: >>>>> >>>>> [r...@hdmaster hadoop-0.20]# grep 4237880568969698703_13404582 >>>>> hadoop-hadoop-namenode-hdmaster.log.2010-11-10 >>>>> >>>>> 2010-11-10 21:42:29,802 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>>>> NameSystem.allocateBlock:<our root >>>>> >>>>> >>>>> dir>/2010.11.10-21.05.29/output/_temporary/_attempt_201010221550_0418_r_000001_0/shard2/IntentTrait.state. >>>>> blk_-4237880568969698703_13404582 >>>>> 2010-11-10 21:42:30,355 INFO org.apache.hadoop.hdfs.StateChange: >>>>> BLOCK* >>>>> NameSystem.addStoredBlock: blockMap updated:<our ip prefix>.169:50010 >>>>> is >>>>> added to blk_-4237880568969698703_13404582 size 326522 >>>>> 2010-11-10 21:42:30,356 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>>>> NameSystem.addStoredBlock: blockMap updated:<our ip prefix>.173:50010 >>>>> is >>>>> added to blk_-4237880568969698703_13404582 size 326522 >>>>> 2010-11-10 21:42:30,357 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>>>> NameSystem.addStoredBlock: blockMap updated:<our ip prefix>.176:50010 >>>>> is >>>>> added to blk_-4237880568969698703_13404582 size 326522 >>>>> 2010-11-10 21:42:33,585 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>>>> ask >>>>> <our ip prefix>.176:50010 to delete blk_-4237880568969698703_13404582 >>>>> 2010-11-10 21:42:42,598 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>>>> ask >>>>> <our ip prefix>.173:50010 to delete blk_719995703651674050_13393533 >>>>> blk_2844104197159113873_13393516 blk_6711472063024831893_13393527 >>>>> blk_240592081553699046_13393054 blk_6933038421638165347_13393232 >>>>> blk_-1288897671244436593_13393351 blk_1408176921409258101_13393152 >>>>> blk_-4237880568969698703_13404582 blk_-4188858742780299895_13393410 >>>>> blk_-2670318277937301225_13393450 blk_-1858397614006984730_13393480 >>>>> blk_6292330466651227068_13393505 >>>>> 2010-11-10 21:42:42,598 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* >>>>> ask >>>>> <our ip prefix>.169:50010 to delete blk_719995703651674050_13393533 >>>>> blk_6711472063024831893_13393527 blk_-514463419649319075_13404582 >>>>> blk_-1980639884885154260_13404582 blk_240592081553699046_13393054 >>>>> blk_8520588884031249451_13393428 blk_-6620589068796293265_13393089 >>>>> blk_6933038421638165347_13393232 blk_-1288897671244436593_13393351 >>>>> blk_-3581082286710707012_13393556 blk_1408176921409258101_13393152 >>>>> blk_-1858397614006984730_13393480 blk_6292330466651227068_13393505 >>>>> blk_2844104197159113873_13393516 blk_-4237880568969698703_13404582 >>>>> blk_-4188858742780299895_13393410 blk_-2670318277937301225_13393450 >>>>> >>>>> >>>>> I'm still not clear why those deletes are happening though. Will read >>>>> through the logs again at those locations. But if anyone's got any >>>>> ideas, >>>>> pls chime in. >>>>> >>>>> DR >>>>> >>>>> >>>> >> >> > -- Todd Lipcon Software Engineer, Cloudera