Paul Baclace wrote:
Doug Cutting expressed a concern to me about using util.Random to generate
random 64-bit block numbers for NDFS. The following is my analysis.
Nice stuff, Paul. Thanks.
It just occurred to me that perhaps we could simply use sequential block
numbering. All block ids are generated centrally on the namenode. If
the namenode can maintain a persistent counter, then sequential
allocation could be used. Since each block addition is logged, the
counter is easy to persist. When replaying the log on namenode startup
we simply keep track of the highest known block id.
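
A minimal sketch of the idea above, assuming block ids start at 0 and
that the replayed log can be read as a plain list of block ids. The
class and method names (SequentialBlockIds, replay, allocate) are
hypothetical and are not taken from the NDFS code.

```java
import java.util.List;

/** Hypothetical sketch: sequential block ids recovered from a replayed log. */
final class SequentialBlockIds {
    // Highest block id seen so far; -1 means no block has been allocated yet.
    private long highestSeen;

    private SequentialBlockIds(long highestSeen) {
        this.highestSeen = highestSeen;
    }

    /** Rebuild the counter by scanning every block id found in the log. */
    static SequentialBlockIds replay(List<Long> loggedBlockIds) {
        long max = -1L; // assumption: ids start at 0
        for (long id : loggedBlockIds) {
            max = Math.max(max, id);
        }
        return new SequentialBlockIds(max);
    }

    /** Hand out the next id; synchronized because allocation is central. */
    synchronized long allocate() {
        return ++highestSeen;
    }
}
```

No separate counter file is needed in this sketch: the counter is
implied by the logged block additions and is recomputed on each startup.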
Blocks are not logged until the file is closed, so there could be a
problem on restart if datanodes report blocks for files that were never
closed. Those block ids would not be in the log, so the namenode could
later allocate the same numbers again, potentially corrupting the
filesystem. I'm not sure what the best way to handle that would be...
Ideas?
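
To make the hazard concrete, here is a small sketch that only detects
the situation and does not propose a fix. It assumes the logged ids and
the ids reported by datanodes are both available as lists after
restart. The names (UnloggedBlockCheck, unloggedAbove) are hypothetical.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch: find reported blocks that a replayed counter would reuse. */
final class UnloggedBlockCheck {
    /**
     * Returns the reported block ids that are greater than the highest
     * logged id. A counter rebuilt from the log alone would eventually
     * allocate these numbers again.
     */
    static SortedSet<Long> unloggedAbove(List<Long> loggedIds, List<Long> reportedIds) {
        long highestLogged = -1L; // assumption: ids start at 0
        for (long id : loggedIds) {
            highestLogged = Math.max(highestLogged, id);
        }
        SortedSet<Long> atRisk = new TreeSet<>();
        for (long id : reportedIds) {
            if (id > highestLogged) {
                atRisk.add(id);
            }
        }
        return atRisk;
    }
}
```

For example, with logged ids 0, 1, 2 and datanodes reporting 1, 2, 3, 4,
the ids 3 and 4 belong to a file that was never closed, and they are
exactly the next two numbers the rebuilt counter would hand out.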
Doug