[ 
https://issues.apache.org/jira/browse/HDFS-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257324#comment-17257324
 ] 

Xiaoqiao He commented on HDFS-15756:
------------------------------------

Thanks [~hbprotoss] for your report. I believe it is possible for a data gap 
to exist between different nodes for a short period of time when a ZooKeeper 
cluster is used to store tokens for Routers. See [ZooKeeper Consistency 
Guarantees|https://zookeeper.apache.org/doc/r3.6.2/zookeeperProgrammers.html#ch_zkGuarantees].
 This case appears very rarely in my practice. (Sorry, we do not use 
Spark 2.4.) I agree that we should enrich the implementation of 
{{AbstractDelegationTokenSecretManager}} using an RDBMS, a KV store, or others.
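To illustrate the gap, here is a toy sketch in Java. It is not ZooKeeper's API: 
the class and method names are made up for illustration, and quorum handling is 
abstracted away. It only models a follower that serves reads from a local copy 
that can lag behind an acknowledged write. As far as I understand the 
Consistency Guarantees page linked above, a real client that needs the latest 
value is expected to call sync() before its read.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: this class is hypothetical and is NOT ZooKeeper's API.
class StaleReadSketch {
    // State held by the leader; every write lands here first.
    private final Map<String, String> leader = new HashMap<>();
    // State held by one follower; it lags until it catches up.
    private final Map<String, String> follower = new HashMap<>();

    // A write is acknowledged once the leader has it (quorum is abstracted away).
    void write(String key, String value) {
        leader.put(key, value);
    }

    // A plain read is served from the follower's local copy, which may be stale.
    String readFromFollower(String key) {
        return follower.get(key);
    }

    // Models "sync": the follower catches up with the leader before serving reads.
    void sync() {
        follower.clear();
        follower.putAll(leader);
    }

    public static void main(String[] args) {
        StaleReadSketch store = new StaleReadSketch();
        store.write("token-1", "renewDate=100");
        // Read right after the write: the follower has not caught up yet.
        System.out.println("before sync: " + store.readFromFollower("token-1"));
        store.sync();
        System.out.println("after sync: " + store.readFromFollower("token-1"));
    }
}
```

In this model, the read issued right after the write misses the token, which 
is the situation described in the report; the read after sync() sees it.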
Just out of interest, is there any impact when the RM fails to renew the token 
upon receiving an application request? IIUC, the failure could be ignored, 
because after 20h (by default), before the token expires, the RM will trigger 
another renew request; by then the token must have been synced and the renew 
operation will succeed. (Of course, some corner cases could arise if the token 
parameters are tuned.)
As for the solution, I do not think it is good to throw StandbyException to the 
client from the Router. Consider someone requesting to renew an invalid/expired 
token, which is in neither memory nor the ZooKeeper store: the client will then 
fail over many times by default. Is that necessary? Thanks, more comments are 
welcome.
cc [~fengnanli],[~csun]

> RBF: Cannot get updated delegation token from zookeeper
> -------------------------------------------------------
>
>                 Key: HDFS-15756
>                 URL: https://issues.apache.org/jira/browse/HDFS-15756
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rbf
>    Affects Versions: 3.0.0
>            Reporter: hbprotoss
>            Priority: Major
>
> Affected version: all versions with RBF
> When RBF works with Spark 2.4 in client mode, there is a chance that a token 
> is missing across different nodes in the RBF cluster. The root cause is that 
> Spark renews the token (via the resource manager) immediately after getting 
> one. As ZooKeeper does not have a strong consistency guarantee after an update 
> in the cluster, a ZooKeeper client may read a stale value from some followers 
> not yet synced with other nodes.
>  
> We applied a patch in Spark, but it is still a problem in RBF. Is it possible 
> for RBF to replace the delegation token store with some other 
> data source (Redis, for example)?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
