[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774934#comment-16774934
 ] 

He Xiaoqiao commented on HDFS-14305:
------------------------------------

Hi [~csun], I think this issue is triggered only after HDFS-6440. Before that 
change, it worked well in an HA cluster with 2 NameNodes (based on branch-2.7). 
Checking the {{serialNo}} ranges there gives the following, with no overlap 
between the 2 NameNodes:
{quote}nnIndex=0: [0, 2147483647]
 nnIndex=1: [-2147483648, -1]
{quote}
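
One way to get exactly these two ranges is to reserve the sign bit of the serial number for the NameNode index. The sketch below illustrates that idea only; the class name, method name, and sample values are made up for this comment, and it is not meant as a verbatim copy of the branch-2.7 code.

```java
// Sketch only: partitions the int space between exactly 2 NameNodes
// by using the sign bit (bit 31) as the NameNode index.
final class SignBitPartitionSketch {
    // All bits set except the sign bit (0x7FFFFFFF).
    static final int LOW_MASK = ~(1 << 31);

    // Keep the low 31 bits of the raw value and put nnIndex (0 or 1) in bit 31.
    static int partition(int rawSerialNo, int nnIndex) {
        return (rawSerialNo & LOW_MASK) | (nnIndex << 31);
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, Integer.MAX_VALUE};
        for (int raw : samples) {
            System.out.println(raw + " -> nnIndex=0: " + partition(raw, 0)
                + ", nnIndex=1: " + partition(raw, 1));
        }
    }
}
```

With this scheme any raw value, negative or not, lands in [0, 2147483647] for nnIndex=0 and in [-2147483648, -1] for nnIndex=1, so the sign of the random initial value does not matter.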
HDFS-6440 replaced {{nnIndex}} with {{intRange}} + {{nnRangeStart}}, and 
distributes only positive integers among the NameNodes. However, the initial 
serialNo can be a negative integer because it comes from {{new 
SecureRandom().nextInt()}}, which causes serialNo overlap between different 
NameNodes in the same namespace. In short, the root cause is 
{{SecureRandom().nextInt()}}.
 I propose using only positive integers as the serialNo of 
BlockTokenSecretManager to avoid this issue. FYI.
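
To make the overlap concrete, here is a minimal sketch that applies the formula from the issue description to a negative starting value. The class and method names are hypothetical, and {{assignNonNegative}} is only one possible way to force a non-negative input (clearing the sign bit); it is not the committed patch.

```java
// Sketch only: reproduces the range formula quoted in the issue description.
final class SerialNoOverlapSketch {
    // (serialNo % intRange) + nnRangeStart, as in the description.
    static int assign(int rawSerialNo, int nnIndex, int numNNs) {
        int intRange = Integer.MAX_VALUE / numNNs;
        int nnRangeStart = intRange * nnIndex;
        // Java's % keeps the sign of the dividend, so a negative raw value
        // yields a negative remainder and shifts the result below nnRangeStart.
        return (rawSerialNo % intRange) + nnRangeStart;
    }

    // Hypothetical fix: clear the sign bit before applying the formula.
    static int assignNonNegative(int rawSerialNo, int nnIndex, int numNNs) {
        return assign(rawSerialNo & Integer.MAX_VALUE, nnIndex, numNNs);
    }

    public static void main(String[] args) {
        int numNNs = 2;
        int intRange = Integer.MAX_VALUE / numNNs;
        int rawOnNn0 = 5;             // positive initial value on the first NameNode
        int rawOnNn1 = 5 - intRange;  // negative initial value on the second NameNode

        System.out.println("nn0: " + assign(rawOnNn0, 0, numNNs));
        System.out.println("nn1: " + assign(rawOnNn1, 1, numNNs));
        System.out.println("nn1 (non-negative input): "
            + assignNonNegative(rawOnNn1, 1, numNNs));
    }
}
```

Both NameNodes end up with serialNo 5 in the first two lines, while the third value stays at or above {{nnRangeStart}} for the second NameNode.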

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-14305
>                 URL: https://issues.apache.org/jira/browse/HDFS-14305
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
>     this.intRange = Integer.MAX_VALUE / numNNs;
>     this.nnRangeStart = intRange * nnIndex;
>     this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNodes could have overlapping 
> ranges for the serial number. For simplicity, let's assume 
> {{Integer.MAX_VALUE}} is 100, and we have 2 NameNodes {{nn1}} and {{nn2}} in 
> the configuration. Then the ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be a negative integer.
> Moreover, when the keys are updated, the serial number is updated again 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could move into a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When a collision happens, DataNodes could overwrite an existing key, which 
> will cause clients to fail with an {{InvalidToken}} error.



