[
https://issues.apache.org/jira/browse/HDFS-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616433#comment-13616433
]
Suresh Srinivas commented on HDFS-4645:
---------------------------------------
I should have added more relevant links. I have now linked the related jira
HDFS-898, which describes the reason for using random block IDs and the
motivation for moving to sequential block IDs. I filed this as a separate jira
because the solution I am proposing is slightly different.
Unlike HDFS-898, where the approach was to find a large enough contiguous range
in the block ID space, in this jira I just want to pick a starting ID and
generate block IDs sequentially from there. For old installs, if a generated
block ID already exists in the block map, we just skip over it.
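A minimal sketch of the skip-over idea described above, not the actual HDFS
code. The class name, the starting ID, and the use of a plain Set as a stand-in
for the namenode block map are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of sequential block ID generation with skip-over. */
class SequentialBlockIdSketch {
    // Last ID handed out; the first call returns startId if it is free.
    private long lastId;
    // Stand-in for the block map: IDs already in use (e.g. old random IDs).
    private final Set<Long> usedIds;

    SequentialBlockIdSketch(long startId, Set<Long> existingIds) {
        this.lastId = startId - 1;
        this.usedIds = new HashSet<>(existingIds); // copy; caller's set is untouched
    }

    /** Returns the next sequential ID that is not already in use. */
    synchronized long nextBlockId() {
        do {
            lastId++;
        } while (usedIds.contains(lastId)); // skip IDs left by the old scheme
        usedIds.add(lastId);
        return lastId;
    }
}
```

With existing IDs 1001, 1002 and 1005 and a starting ID of 1000, successive
calls would hand out 1000, 1003, 1004 and 1006.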
Sequential generation has a lot of advantages. We have a large number of bits at
our disposal that can be used for additional purposes in the block ID. This is
one thing we considered when we added the federation feature: adding the block
pool ID into the block ID itself, using a few bits from the long. This could
also come in handy for identifying different categories of blocks in the future,
by using a few bits to tag a block as such. There is also the advantage of being
able to segregate blocks generated before an epoch, given the predictable order
of generation.
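To make the "few bits from the long" idea concrete, here is a rough sketch of
packing a small tag (a block pool index or a block category) into the high bits
of a 64-bit ID. The 8/56 bit split, the class name, and the method names are
hypothetical; nothing in this jira fixes an actual layout.

```java
/** Hypothetical layout: high 8 bits carry a tag, low 56 bits a sequence number. */
class TaggedBlockIdSketch {
    static final int TAG_BITS = 8;                      // assumed width
    static final int SEQ_BITS = Long.SIZE - TAG_BITS;   // 56
    static final long SEQ_MASK = (1L << SEQ_BITS) - 1;

    /** Combines a tag and a sequence number into one long. */
    static long pack(int tag, long seq) {
        if (tag < 0 || tag >= (1 << TAG_BITS)) {
            throw new IllegalArgumentException("tag out of range: " + tag);
        }
        if (seq < 0 || seq > SEQ_MASK) {
            throw new IllegalArgumentException("sequence out of range: " + seq);
        }
        // Tags of 128 and above set the sign bit, so the packed ID is negative.
        return ((long) tag << SEQ_BITS) | seq;
    }

    /** Extracts the tag; the unsigned shift keeps it in 0..255. */
    static int tagOf(long id) {
        return (int) (id >>> SEQ_BITS);
    }

    /** Extracts the sequence number from the low bits. */
    static long sequenceOf(long id) {
        return id & SEQ_MASK;
    }
}
```

Within one tag, a smaller sequence number means an earlier block, which is what
would allow blocks generated before some epoch to be separated from later ones.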
All new installs would use sequential block IDs. Old installs would, over a
period of time, also move to the new scheme as old data gets deleted and the
blocks with IDs generated by the previous scheme go away.
bq. Will new block ids be allowed to be < 0?
I think we could reserve a few block IDs, say 0-65535, and start generating from
65536. When it reaches some max, we could roll over to negative numbers. That is
a decision that can be made in the future.
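Purely as an illustration of one way the reserved range and rollover could
behave (the actual policy is left open above): the sketch below assumes IDs
0-65535 are reserved, the first generated ID is 65536, and the "max" is
Long.MAX_VALUE, after which Java's long arithmetic wraps to Long.MIN_VALUE. The
class and constant names are hypothetical.

```java
/** Hypothetical sketch: reserved low IDs plus rollover into negative IDs. */
class ReservedRangeSketch {
    static final long LAST_RESERVED_ID = 65535L; // assumed reserved range 0..65535

    private long lastId;

    ReservedRangeSketch() {
        this(LAST_RESERVED_ID);
    }

    // Lets a caller resume from a persisted "last allocated" value.
    ReservedRangeSketch(long lastAllocatedId) {
        this.lastId = lastAllocatedId;
    }

    synchronized long nextBlockId() {
        // After wrapping, IDs run from Long.MIN_VALUE up to -1; the next value
        // would be 0, which is reserved, so the space is treated as exhausted.
        if (lastId == -1L) {
            throw new IllegalStateException("block ID space exhausted");
        }
        lastId++; // wraps from Long.MAX_VALUE to Long.MIN_VALUE
        return lastId;
    }
}
```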
> Move from randomly generated block ID to sequentially generated block ID
> ------------------------------------------------------------------------
>
> Key: HDFS-4645
> URL: https://issues.apache.org/jira/browse/HDFS-4645
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.0.0
> Reporter: Suresh Srinivas
> Assignee: Suresh Srinivas
>
> Currently, block IDs are randomly generated. This means there is no pattern to
> block ID generation, and no guarantees, such as uniqueness of a block ID for
> the lifetime of the system, can be made. I propose using SequentialNumber for
> block ID generation.