Hexiaoqiao commented on PR #478: URL: https://github.com/apache/curator/pull/478#issuecomment-1704329621
@kezhuw Thanks for your detailed comments. I totally agree that this issue is not easy to fix perfectly. But in my case, this improvement fixes the problem when I try to reproduce it.

> Besides above, did you find this in production ? What is your use case ?

Yes, of course. The corresponding code snippet is here:
https://github.com/apache/hadoop/blob/a6c2526c6c7ccfc4483f58695aedf0cb892b4c4d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java#L506-L516

I would like to give a brief explanation (it depends on the Hadoop project).

A. In the HDFS RBF architecture[1], all Routers share the same state, which is managed by ZooKeeper, and tokens are part of that state.
B. About tokens[2]: a Router generates a token using an incrementing sequence number shared by all Routers and updates the value of the corresponding znode (/zkdtsm/ZKDTSMRoot/ZKDTSMSeqNumRoot); let's call it Z.
C. When tokens are generated at a rate of a million per day or more, the version of znode Z will overflow within only a few years of its creation (maybe less than three years if 5 million tokens are generated per day).

At first I wanted to fix this on the Hadoop side, but that is not smooth and would not fix the root cause. For this PR, my first thought is to fix it on the Curator side first, then try to improve it on both the ZooKeeper and Curator sides, as in the solution you mentioned above. Any thoughts? Thanks.

[1] https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs-rbf/HDFSRouterFederation.html
[2] http://hortonworks.com/wp-content/uploads/2011/08/adding_security_to_apache_hadoop.pdf
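
The estimate in point C can be checked with a little arithmetic. The following is only a minimal sketch: it assumes the znode data version is a signed 32-bit int (as in ZooKeeper's `Stat`) and that every generated token bumps the version of Z exactly once. The class and method names are hypothetical and do not come from Hadoop or Curator.

```java
public class VersionOverflowEstimate {
    // Assumption: the znode data version is a signed 32-bit int.
    static final long MAX_VERSION = Integer.MAX_VALUE; // 2_147_483_647

    // Whole days until the version reaches its maximum, assuming
    // one version bump per generated token (hypothetical model).
    static long daysUntilOverflow(long updatesPerDay) {
        return MAX_VERSION / updatesPerDay;
    }

    // Java int arithmetic wraps around silently on overflow.
    static int nextVersion(int version) {
        return version + 1;
    }

    public static void main(String[] args) {
        long[] ratesPerDay = {1_000_000L, 2_000_000L, 5_000_000L};
        for (long rate : ratesPerDay) {
            long days = daysUntilOverflow(rate);
            System.out.printf("%d updates/day -> %d days (~%.1f years)%n",
                    rate, days, days / 365.0);
        }
        // One more update after the maximum yields a negative version.
        System.out.println("after overflow: " + nextVersion(Integer.MAX_VALUE));
    }
}
```

Under these assumptions, 5 million updates per day exhaust the positive range in roughly 429 days, and the next update wraps the version to a negative value, which is the situation a version-checked update has to cope with.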
