[ http://issues.apache.org/jira/browse/HADOOP-146?page=all ]
     
Doug Cutting closed HADOOP-146:
-------------------------------


> potential conflict in block id's, leading to data corruption
> ------------------------------------------------------------
>
>          Key: HADOOP-146
>          URL: http://issues.apache.org/jira/browse/HADOOP-146
>      Project: Hadoop
>         Type: Bug

>   Components: dfs
>     Versions: 0.1.0, 0.1.1
>     Reporter: Yoram Arnon
>     Assignee: Konstantin Shvachko
>      Fix For: 0.3.0
>  Attachments: hadoop-146-random.patch
>
> Currently, block IDs are generated randomly and are not checked for
> collisions with existing IDs.
> Although IDs are 64 bits wide, collisions are expected given enough time and
> a large enough FS.
> When a collision occurs, a random subset of the blocks with that ID is
> removed as extra replicas, and the contents of that portion of the containing
> file become one random version of the block.
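
A rough birthday-bound estimate makes "collisions are expected" concrete. This is a back-of-the-envelope sketch, not a figure from the report. It assumes the IDs are independent and uniform over all N = 2^64 values, and that n is the number of block IDs that coexist in the file system:

```latex
P(\text{collision})
  \;=\; 1 - \prod_{k=0}^{n-1}\left(1 - \frac{k}{N}\right)
  \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right)
  \;\approx\; \frac{n^{2}}{2N} \qquad (n \ll \sqrt{N})
```

Under these assumptions, n = 10^8 blocks gives a probability of roughly 2.7e-4, while n = 2^32 (about 4.3e9) blocks gives 1 - e^(-1/2), or about 0.39. The uniform assumption is probably optimistic: the Javadoc for java.util.Random.nextLong notes that, because the generator uses a 48-bit seed, it cannot return every possible long value.
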
> One solution is to check for an ID collision when creating a new block and
> to pick a new ID in case of conflict. This approach requires the name node to
> keep track of all existing block IDs (rather than just the ones that have
> reported in), and to identify old versions of a block ID as invalid (in case
> a data node dies, a file is deleted, and then the block ID is reused for a new
> file).
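
The collision-check approach described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the name node's code: the class CheckedBlockIdAllocator and its methods are hypothetical names, and the in-memory set stands in for whatever record of existing block IDs the name node would actually keep.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch: retry random IDs until one is not already in use.
class CheckedBlockIdAllocator {
    private final Random random;
    // Every ID that some data node might still hold a replica for.
    private final Set<Long> idsInUse = new HashSet<>();

    CheckedBlockIdAllocator(Random random) {
        this.random = random;
    }

    // Returns an ID that was not in the set at the time of the call.
    synchronized long allocate() {
        long id;
        do {
            id = random.nextLong();
        } while (!idsInUse.add(id)); // add() returns false on a collision
        return id;
    }

    // Assumption: called only once no data node can still report this ID,
    // otherwise a stale replica could be mistaken for the new block.
    synchronized void release(long id) {
        idsInUse.remove(id);
    }

    synchronized boolean isInUse(long id) {
        return idsInUse.contains(id);
    }
}
```

The sketch does not address the second requirement in the paragraph above: recognizing an old replica that reappears after its ID has been released and reused would need extra state, which is left out here.
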
> Alternatively, one could simply use sequential block IDs. The downsides here
> are:
> 1. Migration from an existing file system is hard, requiring compaction of
> the entire FS.
> 2. Once you cycle through 64 bits of IDs (quite a few years at full blast),
> you're in trouble again (unless you run occasional or background compaction).
> 3. You must never lose the high-watermark block ID.
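
The sequential alternative, with downsides 2 and 3 in mind, might be sketched as follows. Everything here is an assumption for illustration: WatermarkStore and SequentialBlockIdAllocator are hypothetical names, and the store is assumed to make the watermark durable before save() returns. Reserving IDs in batches is one possible design choice, not something proposed in the report.

```java
// Hypothetical persistence hook; save() must be durable before it returns.
interface WatermarkStore {
    long load();
    void save(long highWatermark);
}

// Hypothetical sketch: sequential IDs backed by a persisted high watermark.
class SequentialBlockIdAllocator {
    private final WatermarkStore store;
    private final long batchSize;
    private long next;     // next ID to hand out
    private long reserved; // IDs below this are covered by the saved watermark

    SequentialBlockIdAllocator(WatermarkStore store, long batchSize) {
        this.store = store;
        this.batchSize = batchSize;
        // After a restart, resume at the saved watermark. Unused IDs from
        // the previous batch are skipped, so no ID is handed out twice.
        this.next = store.load();
        this.reserved = this.next;
    }

    synchronized long allocate() {
        if (next == reserved) {
            // addExact throws ArithmeticException rather than wrapping
            // silently once the 64-bit space is exhausted.
            long newReserved = Math.addExact(reserved, batchSize);
            store.save(newReserved); // persist before using the new batch
            reserved = newReserved;
        }
        return next++;
    }
}
```

Persisting the watermark ahead of use trades a few skipped IDs per restart for one durable write per batch instead of one per block.
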
>
> synchronized Block allocateBlock(UTF8 src) {
>     // The new block gets its ID in the Block() constructor below.
>     Block b = new Block();
>     FileUnderConstruction v =
>         (FileUnderConstruction) pendingCreates.get(src);
>     v.add(b);
>     pendingCreateBlocks.add(b);
>     return b;
> }
>
> static Random r = new Random();
>
> public Block() {
>     // Random 64-bit ID; not checked against existing block IDs.
>     this.blkid = r.nextLong();
>     this.len = 0;
> }

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
