[
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16940652#comment-16940652
]
Xiaoqiao He commented on HDFS-14305:
------------------------------------
Thanks [~shv] and [~arp] very much for your feedback and work. It seems that
all the work we have done (including [^HDFS-14305.006.patch]) only reduces the
probability of a conflict rather than avoiding it completely.
{quote}If you start the NNs in arbitrary order, you can get block token
collisions because the ranges will change in 3.2.1 compared to 3.2.0.{quote}
This case does not seem to be eliminated with or without the
[^HDFS-14305-007.patch] changes.
Please correct me if I am wrong. Thanks again, [~shv].
> Serial number in BlockTokenSecretManager could overlap between different
> namenodes
> ----------------------------------------------------------------------------------
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode, security
> Reporter: Chao Sun
> Assignee: Xiaoqiao He
> Priority: Major
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch,
> HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch,
> HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and
> {{nnIndex}} is the index of the current NameNode specified in the
> configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNodes could have overlapping
> serial number ranges. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100,
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be negative, and Java's
> {{%}} operator keeps the sign of the dividend.
> Moreover, when the keys are updated, the serial number will again be updated
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to
> a different NameNode, thus increasing the chance of collision again.
> When a collision happens, DataNodes could overwrite an existing key, which
> will cause clients to fail with an {{InvalidToken}} error.
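
The overlap described in the quoted issue can be reproduced with a small standalone sketch. It is not code from Hadoop or from any of the attached patches: the class {{SerialRangeDemo}}, its methods, and the {{maxValue}} parameter (standing in for {{Integer.MAX_VALUE}}, set to 100 as in the example) are hypothetical names used only for illustration. Only the arithmetic is taken from the formula in the description.

```java
public class SerialRangeDemo {
    // Same arithmetic as the formula quoted in the issue description;
    // maxValue stands in for Integer.MAX_VALUE so the numbers stay small.
    static int serialFor(int seed, int maxValue, int numNNs, int nnIndex) {
        int intRange = maxValue / numNNs;
        int nnRangeStart = intRange * nnIndex;
        return (seed % intRange) + nnRangeStart;
    }

    // Smallest and largest serial number the formula can produce for one NameNode.
    static int[] range(int maxValue, int numNNs, int nnIndex) {
        int intRange = maxValue / numNNs;
        int nnRangeStart = intRange * nnIndex;
        // Java's % keeps the sign of the dividend, so the remainder lies in
        // [-(intRange - 1), intRange - 1].
        return new int[] {nnRangeStart - (intRange - 1), nnRangeStart + (intRange - 1)};
    }

    public static void main(String[] args) {
        // Index 0 corresponds to nn1 and index 1 to nn2 in the example.
        for (int nnIndex = 0; nnIndex < 2; nnIndex++) {
            int[] r = range(100, 2, nnIndex);
            System.out.println("nn" + (nnIndex + 1) + " -> [" + r[0] + ", " + r[1] + "]");
        }
        // A negative seed on nn2 and a positive seed on nn1 yield the same serial number.
        System.out.println(serialFor(-7, 100, 2, 1) + " == " + serialFor(43, 100, 2, 0));
    }
}
```

With these inputs, a seed of -7 on {{nn2}} and a seed of 43 on {{nn1}} both map to serial number 43, which is the kind of collision the description refers to.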