Hi everyone,

We are working to improve the Hadoop Distributed File System (HDFS). Our current plan is to implement single-client append and truncate on top of the Hadoop 0.15.0 release.
Our implementation will follow the design of HADOOP-1700. In the latest version of HADOOP-1700 we found some points (or bugs) that should be corrected or improved. Here they are:

0. We recommend that block IDs not be random 64-bit values. Instead we keep a global variable NextBlockID. NextBlockID is a 64-bit counter, initialized to 0 and recorded in the transaction log. When a new block ID is needed, the current value of NextBlockID becomes the new block ID, NextBlockID is incremented by one, and the new value of NextBlockID is recorded in the transaction log. (A sketch of such an allocator is given after this list.)

1. The section "The Writer" of HADOOP-1700 says:

____________________________________________________________________________
... The Writer requests the Namenode to create a new file or open an existing file with an intention of appending to it. The Namenode generates a new blockId and a new GenerationStamp for this block. Let's call the GenerationStamp that is associated with a block as BlockGenerationStamp. A new BlockGenerationStamp is generated by incrementing the global GenerationStamp by one and storing the global GenerationStamp back into the transaction log. It records the blockId, block locations and the BlockGenerationStamp in the BlocksMap. The Namenode returns the blockId, BlockGenerationStamp and block locations to the Client. ...
____________________________________________________________________________

Our comment: the Writer requests the Namenode to create a new file or open an existing file with the intention of appending to it. The Namenode generates a new blockId and a new GenerationStamp for the block (the GenerationStamp associated with a block is the BlockGenerationStamp); the new BlockGenerationStamp is generated by incrementing the global GenerationStamp by one and storing the global GenerationStamp back into the transaction log. Up to here this is the same as HADOOP-1700. The difference is what the Namenode records. If the block is a new block, the Namenode returns the blockId, BlockGenerationStamp and block locations to the client, but does NOT yet record them in the BlocksMap. If the block is an existing block (the last block of the file is not full), the Namenode generates a new BlockGenerationStamp and returns the blockId, block locations, old BlockGenerationStamp and new BlockGenerationStamp to the client, again without recording the new BlockGenerationStamp. Only when the client has successfully updated all the datanodes holding replicas of the block does it report back to the Namenode with (blockId, block locations, new BlockGenerationStamp). At that point the Namenode records the blockId, block locations and new BlockGenerationStamp in the BlocksMap and writes the OpenFile record to the transaction log. (A sketch of this two-step update is given after this list.)

Note: this ordering tolerates the following failure. If the Namenode recorded the new BlockGenerationStamp in the BlocksMap first and the client then failed to update the datanodes (or the writer crashed), the Namenode would start lease recovery; at that moment the generation stamp on the replicas could be smaller than the one recorded in the Namenode, so the Namenode would discard the replicas. That is not the result we want, and deferring the BlocksMap update avoids it.

2. In HADOOP-1700 the OpenFile transaction log record contains the whole block list. If a client calls flush frequently, this can put a heavy load on the Namenode. So we adopt the following method: we record only the block that changed and skip the blocks that were not modified. If a new block is created, we record its blockId and BlockGenerationStamp; if only the BlockGenerationStamp of an existing block changed, we record just the new BlockGenerationStamp for that block. This reduces the Namenode overhead of writing the OpenFile transaction log. (A sketch of these incremental records is given after this list.)
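To make point 0 concrete, here is a minimal Java sketch of a sequential block ID allocator. The class name NextBlockIdAllocator and the EditLog hook are our own illustrative names, not existing Hadoop 0.15.0 APIs; the real namenode would persist the counter through its edit log.

public class NextBlockIdAllocator {
    /** Hypothetical hook standing in for the namenode transaction log. */
    public interface EditLog {
        void logNextBlockId(long nextBlockId);
    }

    private long nextBlockId;       // 64-bit counter, initialized to 0 when the filesystem is formatted
    private final EditLog editLog;  // persists the counter so block IDs survive a namenode restart

    public NextBlockIdAllocator(long initialValue, EditLog editLog) {
        this.nextBlockId = initialValue;
        this.editLog = editLog;
    }

    /** Returns the current counter value as the new block ID and logs the incremented counter. */
    public synchronized long allocateBlockId() {
        long id = nextBlockId;
        nextBlockId = id + 1;
        editLog.logNextBlockId(nextBlockId); // record the new NextBlockID in the transaction log
        return id;
    }
}

Here is a rough Java sketch of the two-step update in point 1, seen from the Namenode side, for the case of appending to an existing last block. AppendCoordinator, startAppend and commitAppend are illustrative names of ours, not the HADOOP-1700 code; the places where a real namenode would also write to the transaction log are marked in comments.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AppendCoordinator {

    /** In-memory view of a block, standing in for the BlocksMap entry. */
    static class BlockInfo {
        long generationStamp;
        List<String> locations;            // datanodes holding replicas
        BlockInfo(long generationStamp, List<String> locations) {
            this.generationStamp = generationStamp;
            this.locations = locations;
        }
    }

    /** What the client receives when it asks to append to the existing last block. */
    static class AppendGrant {
        final long blockId;
        final long oldGenerationStamp;     // stamp currently on the replicas
        final long newGenerationStamp;     // stamp the client must push to every datanode
        final List<String> locations;
        AppendGrant(long blockId, long oldGS, long newGS, List<String> locations) {
            this.blockId = blockId;
            this.oldGenerationStamp = oldGS;
            this.newGenerationStamp = newGS;
            this.locations = locations;
        }
    }

    private final Map<Long, BlockInfo> blocksMap = new HashMap<Long, BlockInfo>();
    private long globalGenerationStamp = 0;

    /** Step 1: hand out a new generation stamp WITHOUT touching the BlocksMap entry yet. */
    public synchronized AppendGrant startAppend(long blockId) {
        BlockInfo info = blocksMap.get(blockId); // the sketch assumes the block is already registered
        long newGS = ++globalGenerationStamp;    // a real namenode also logs the new global stamp
        return new AppendGrant(blockId, info.generationStamp, newGS, info.locations);
    }

    /** Step 2: the client reports that every replica now carries newGS; only now
     *  does the namenode update the BlocksMap and write the OpenFile transaction. */
    public synchronized void commitAppend(long blockId, long newGS, List<String> locations) {
        BlockInfo info = blocksMap.get(blockId);
        info.generationStamp = newGS;
        info.locations = locations;
        // a real namenode would write the OpenFile transaction log record here
    }
}

Finally, a small sketch of the incremental OpenFile records from point 2. The opcode values and the on-disk layout are assumptions made only for illustration; they are not the actual Hadoop edit-log format.

import java.io.DataOutputStream;
import java.io.IOException;

public class IncrementalOpenFileLog {
    private static final byte OP_ADD_BLOCK = 1;     // a new block was appended to the file
    private static final byte OP_UPDATE_STAMP = 2;  // only the generation stamp of a block changed

    private final DataOutputStream editLog;

    public IncrementalOpenFileLog(DataOutputStream editLog) {
        this.editLog = editLog;
    }

    /** New block: record its id and initial BlockGenerationStamp. */
    public void logNewBlock(String path, long blockId, long generationStamp) throws IOException {
        editLog.writeByte(OP_ADD_BLOCK);
        editLog.writeUTF(path);
        editLog.writeLong(blockId);
        editLog.writeLong(generationStamp);
    }

    /** Existing block: record only the new BlockGenerationStamp, not the whole block list. */
    public void logNewGenerationStamp(String path, long blockId, long newGenerationStamp) throws IOException {
        editLog.writeByte(OP_UPDATE_STAMP);
        editLog.writeUTF(path);
        editLog.writeLong(blockId);
        editLog.writeLong(newGenerationStamp);
    }
}

With records like these, a frequent flush that only bumps one block's generation stamp costs a few dozen bytes in the edit log instead of re-serializing the file's whole block list.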
We look forward to getting help from the Hadoop community. Any advice will be appreciated.

Best regards,
Ruyue Ma
[EMAIL PROTECTED]
Beijing, China