[ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12413909 ]
Sameer Paranjpye commented on HADOOP-158:
-----------------------------------------

It can be sequential. In that case, the namenode would need to determine the lowest unused file-id at startup and start file-id assignments from that point.

Even sequential allocation of file-ids should probably do the collision check, because you don't need a trillion files in the system before you wrap around; you only need a trillion file-creation events. If you're doing the collision check in both schemes, random file-id assignment keeps things simpler. The possibility of collision with sequential assignment of file-ids is very remote, but why expose ourselves? I'm probably being paranoid, so ignore me on this one if you want.

> dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file
> ---------------------------------------------------------------------------------------------------------
>
>          Key: HADOOP-158
>          URL: http://issues.apache.org/jira/browse/HADOOP-158
>      Project: Hadoop
>         Type: Bug
>   Components: dfs
>     Versions: 0.1.0
>     Reporter: Doug Cutting
>     Assignee: Konstantin Shvachko
>      Fix For: 0.4
>
> A random number generator is used to allocate block ids in dfs. Sometimes a block id is allocated that is already used in the filesystem, which causes filesystem corruption.
>
> A short-term fix is to check, when allocating a block id, whether any file is already using the newly allocated id and, if so, generate another one. There can still be collisions in some rare conditions, but these are harder to fix and will wait, since this simple fix handles the vast majority of collisions.
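For illustration only, a minimal sketch of the short-term fix quoted above: random block-id allocation with a collision check against ids already in use. The class BlockIdAllocator, the in-memory usedIds set, and the method allocateBlockId are assumptions made for this sketch, not the namenode's actual data structures or API.

    import java.util.HashSet;
    import java.util.Random;
    import java.util.Set;

    // Sketch only: pick a random block id and retry until it does not
    // collide with an id already in use. The in-memory set stands in for
    // the namenode's real block map; none of these names are the real API.
    public class BlockIdAllocator {
        private final Random random = new Random();
        private final Set<Long> usedIds = new HashSet<Long>();

        public synchronized long allocateBlockId() {
            long id;
            do {
                id = random.nextLong();      // candidate block id
            } while (!usedIds.add(id));      // add() returns false on collision, so retry
            return id;
        }
    }

With a 64-bit id space and a collision check, retries stay extremely rare even after many creation events; the check simply removes the wraparound/duplication concern raised in the comment.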
