[ 
https://issues.apache.org/jira/browse/HDFS-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616433#comment-13616433
 ] 

Suresh Srinivas commented on HDFS-4645:
---------------------------------------

I should have added more relevant links. I have added the related jira HDFS-898.
It describes the reason for using random block IDs and the motivation for moving
to sequential block IDs. I have filed this as a separate jira because the
solution I am proposing is slightly different.

Unlike HDFS-898, where the approach was to find a large enough contiguous range 
in the block ID space, in this jira I just want to start with a starting ID and 
generate block IDs sequentially. For old installs, if a block ID already exists 
in the block map, we just skip over it.
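
A minimal sketch of the skip-over idea described above, not the actual namenode
code. The class name SkippingBlockIdGenerator is hypothetical, and a plain
Set<Long> stands in for the block map. The counter holds the last value handed
out, so a starting value of 65535 yields 65536 as the first ID.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: sequential block IDs that skip IDs already in use. */
final class SkippingBlockIdGenerator {
    private long lastId;                  // last value handed out (or the start value)
    private final Set<Long> existingIds;  // assumption: stand-in for the block map

    SkippingBlockIdGenerator(long startId, Set<Long> existingIds) {
        this.lastId = startId;
        this.existingIds = existingIds;
    }

    /** Advances the counter until it lands on an ID that is not already present. */
    synchronized long nextBlockId() {
        do {
            lastId++;
        } while (existingIds.contains(lastId));
        return lastId;
    }

    public static void main(String[] args) {
        Set<Long> legacy = new HashSet<>();
        legacy.add(65537L); // an old, randomly generated ID that happens to collide
        SkippingBlockIdGenerator gen = new SkippingBlockIdGenerator(65535L, legacy);
        System.out.println(gen.nextBlockId()); // 65536
        System.out.println(gen.nextBlockId()); // 65538, because 65537 is skipped
    }
}
```

New installs never enter the loop body more than once, since their block map
contains no randomly generated IDs.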

Sequential generation has a lot of advantages. We have a large number of bits at 
our disposal that can be used for additional purposes in the block ID. This is 
one thing we considered when we added the federation feature: adding the block 
pool ID into the block ID itself, using a few bits from the long. This could 
also come in handy for identifying different categories of blocks in the future, 
by using a few bits to tag a block as such. There is also the advantage of being 
able to segregate blocks generated before an epoch, given the predictable order 
of generation.
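
To illustrate how a few bits of the long could carry extra information, here is
a sketch under stated assumptions. The layout (4-bit category tag, 12-bit pool
index, 48-bit sequence number), the class name TaggedBlockId, and the idea of a
small integer index standing in for a block pool ID are all hypothetical; none
of them come from HDFS.

```java
/** Hypothetical layout: [ 4-bit category | 12-bit pool index | 48-bit sequence ]. */
final class TaggedBlockId {
    static final int SEQ_BITS = 48;
    static final int POOL_BITS = 12;
    static final int CAT_BITS = 4;
    static final long SEQ_MASK = (1L << SEQ_BITS) - 1;
    static final long POOL_MASK = (1L << POOL_BITS) - 1;
    static final long CAT_MASK = (1L << CAT_BITS) - 1;

    /** Packs the three fields into one long; out-of-range bits are masked off. */
    static long encode(int category, int poolIndex, long sequence) {
        return ((category & CAT_MASK) << (SEQ_BITS + POOL_BITS))
             | ((poolIndex & POOL_MASK) << SEQ_BITS)
             | (sequence & SEQ_MASK);
    }

    static int category(long id)  { return (int) (id >>> (SEQ_BITS + POOL_BITS)); }
    static int poolIndex(long id) { return (int) ((id >>> SEQ_BITS) & POOL_MASK); }
    static long sequence(long id) { return id & SEQ_MASK; }

    public static void main(String[] args) {
        long id = encode(2, 5, 65536L);
        System.out.println(category(id));  // 2
        System.out.println(poolIndex(id)); // 5
        System.out.println(sequence(id));  // 65536
    }
}
```

Because the sequence field only grows, comparing it against a recorded cutoff
value is enough to separate blocks created before that point from later ones.
Note that in this layout a category value of 8 or more sets the sign bit, so
the resulting ID is negative.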

All new installs would use sequential block IDs. Old installs would, over a 
period of time, gradually shed blocks whose IDs were generated using the 
previous scheme and converge on the new scheme as old data gets deleted.

bq. Will new block ids be allowed to be < 0?
I think we could reserve a few block IDs, say 0-65535, and start generating from 
65535. When the counter reaches some max, we could roll over to negative 
numbers. That is a decision that can be made in the future.
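
A small sketch of what a reserved range plus rollover could look like. This is
only one possible reading of the idea, since the rollover policy is left open
above. The class name, the method names, and the choice of Long.MIN_VALUE as
the rollover target are assumptions; 65535 is the illustrative bound from the
comment.

```java
/** Hypothetical helpers for a reserved ID range and rollover into negative IDs. */
final class ReservedRangeSketch {
    static final long LAST_RESERVED_ID = 65535L; // illustrative bound, not a real constant

    /** True for IDs in the reserved range 0..LAST_RESERVED_ID. */
    static boolean isReserved(long id) {
        return id >= 0 && id <= LAST_RESERVED_ID;
    }

    /** Returns the ID after 'current', jumping to the most negative long at 'max'. */
    static long next(long current, long max) {
        return current == max ? Long.MIN_VALUE : current + 1;
    }

    public static void main(String[] args) {
        System.out.println(next(LAST_RESERVED_ID, Long.MAX_VALUE)); // 65536
        System.out.println(next(Long.MAX_VALUE, Long.MAX_VALUE) < 0); // true
    }
}
```

A generator that has rolled over would eventually count up through the negative
range toward 0, so a real implementation would also need to stop or jump before
re-entering the reserved range.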
                
> Move from randomly generated block ID to sequentially generated block ID
> ------------------------------------------------------------------------
>
>                 Key: HDFS-4645
>                 URL: https://issues.apache.org/jira/browse/HDFS-4645
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.0.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>
> Currently block IDs are randomly generated. This means there is no pattern to 
> block ID generation, and no guarantees, such as uniqueness of a block ID for 
> the lifetime of the system, can be made. I propose using SequentialNumber for 
> block ID generation.
