[ http://issues.apache.org/jira/browse/HADOOP-146?page=comments#action_12375011 ]
Doug Cutting commented on HADOOP-146:
-------------------------------------

I'd vote for sequential allocation.  It will take a *really* long time to cycle through all ids.

Migration should not be expensive, since it just requires renaming block files, not copying them.  The high-watermark block id can be logged with the block->name table.

Here's one way to migrate: initially the high-water-mark id is zero.  So all blocks in the name table are out-of-range, and hence need renaming.  Renaming can be handled like other blockwork: the namenode can give datanodes rename commands.  While a block is being renamed it must be kept in side tables, so that, e.g., requests to read files whose blocks are partially renamed can still be handled.

> potential conflict in block id's, leading to data corruption
> ------------------------------------------------------------
>
>          Key: HADOOP-146
>          URL: http://issues.apache.org/jira/browse/HADOOP-146
>      Project: Hadoop
>         Type: Bug
>   Components: dfs
>     Versions: 0.1.0, 0.1.1
>     Reporter: Yoram Arnon
>     Assignee: Konstantin Shvachko
>      Fix For: 0.3
>
> currently, block id's are generated randomly, and are not tested for
> collisions with existing id's.
> while ids are 64 bits, given enough time and a large enough FS, collisions
> are expected.
> when a collision occurs, a random subset of blocks with that id will be
> removed as extra replicas, and the contents of that portion of the containing
> file are one random version of the block.
> to solve this one could check for id collision when creating a new block,
> getting a new id in case of conflict. This approach requires the name node to
> keep track of all existing block id's (rather than just the ones who have
> reported in), and to identify old versions of a block id as invalid (in case
> a data node dies, a file is deleted, then a block id is reused for a new
> file).
> Alternatively, one could simply use sequential block id's. Here the downsides
> are:
> 1. migration from an existing file system is hard, requiring compaction of
> the entire FS
> 2. once you cycle through 64 bits of id's (quite a few years at full blast),
> you're in trouble again (or run occasional/background compaction)
> 3. you must never lose the high watermark block id.
>
>   synchronized Block allocateBlock(UTF8 src) {
>     Block b = new Block();
>     FileUnderConstruction v = (FileUnderConstruction) pendingCreates.get(src);
>     v.add(b);
>     pendingCreateBlocks.add(b);
>     return b;
>   }
>
>   static Random r = new Random();
>
>   /**
>    */
>   public Block() {
>     this.blkid = r.nextLong();
>     this.len = 0;
>   }
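The sequential scheme and the rename-based migration discussed above could look roughly like the sketch below. This is not Hadoop's code: the class `SequentialBlockIds`, its method names, and the decision to hand out only positive ids are all assumptions made for illustration. The returned map stands in for the "side table" of blocks being renamed, and a real namenode would also have to log the high-water mark durably before handing out an id.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: sequential block ids plus a rename plan for migration. */
class SequentialBlockIds {
    // Highest id handed out so far. In a real system this would be
    // logged with the block->name table so that it is never lost.
    private long highWaterMark;

    SequentialBlockIds(long recoveredHighWaterMark) {
        if (recoveredHighWaterMark < 0) {
            throw new IllegalArgumentException("high-water mark must be >= 0");
        }
        this.highWaterMark = recoveredHighWaterMark;
    }

    /** Hands out the next id; ids at or below the mark are never reused. */
    synchronized long allocate() {
        if (highWaterMark == Long.MAX_VALUE) {
            throw new IllegalStateException("block id space exhausted");
        }
        return ++highWaterMark;
    }

    synchronized long highWaterMark() {
        return highWaterMark;
    }

    /**
     * Builds an old-id -> new-id rename plan for every existing block whose
     * id is out of range (not in 1..highWaterMark at the start of the call).
     * With a fresh mark of zero, every existing block is out of range.
     */
    synchronized Map<Long, Long> planMigration(Collection<Long> existingIds) {
        long startMark = highWaterMark;
        Set<Long> pending = new HashSet<>();
        for (long id : existingIds) {
            if (id <= 0 || id > startMark) {
                pending.add(id);
            }
        }
        Map<Long, Long> renames = new LinkedHashMap<>();
        for (long oldId : existingIds) {
            if (!pending.contains(oldId) || renames.containsKey(oldId)) {
                continue; // already in range, or already planned
            }
            long newId = allocate();
            // Skip ids still used by a block awaiting its rename, so a
            // rename never lands on top of an existing block file.
            while (pending.contains(newId)) {
                newId = allocate();
            }
            renames.put(oldId, newId);
        }
        return renames;
    }
}
```

Each entry of the plan corresponds to one rename command the namenode could give a datanode. Until a rename completes, lookups would need to consult both the old and the new id, which is the role of the side tables mentioned in the comment.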

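To put rough numbers on "a *really* long time" and on "collisions are expected", the following sketch does the arithmetic. The allocation rate and block count passed in are hypothetical inputs, not measurements from any cluster; the class and method names are invented here. It assumes sequential ids use only the positive half of a signed 64-bit long, and that random ids are uniform over all 64 bits, in which case the standard birthday approximation applies.

```java
/** Hypothetical back-of-the-envelope helpers for 64-bit block ids. */
class BlockIdArithmetic {
    /** Years needed to hand out all 2^63 positive ids at a constant rate. */
    static double yearsToExhaust(double allocationsPerSecond) {
        double positiveIds = Math.pow(2, 63);
        double secondsPerYear = 365.25 * 24 * 60 * 60;
        return positiveIds / allocationsPerSecond / secondsPerYear;
    }

    /**
     * Birthday approximation: probability that at least two of `blocks`
     * uniformly random 64-bit ids coincide, 1 - exp(-n(n-1)/2 / 2^64).
     */
    static double collisionProbability(double blocks) {
        double pairs = blocks * (blocks - 1) / 2;
        // -expm1(-x) computes 1 - e^(-x) accurately for small x.
        return -Math.expm1(-pairs / Math.pow(2, 64));
    }
}
```

For example, `BlockIdArithmetic.yearsToExhaust(1e6)` gives the time to cycle through the ids at an assumed one million allocations per second, and `BlockIdArithmetic.collisionProbability(1e9)` gives the chance of at least one collision among an assumed one billion randomly numbered blocks.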