[ https://issues.apache.org/jira/browse/HDFS-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chingachgook updated HDFS-17421:
--------------------------------
    Issue Type: Bug  (was: Test)

> Check the correctness of the calculation RpcAuthentication*
> -----------------------------------------------------------
>
>                 Key: HDFS-17421
>                 URL: https://issues.apache.org/jira/browse/HDFS-17421
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, metrics
>            Reporter: Chingachgook
>            Priority: Major
>
> Hi,
> I have a question about two HDFS metrics. It came up while I was trying to
> estimate the load on the KDC for a production cluster.
> The HDFS metrics include two counters:
> RpcAuthenticationSuccesses - Total number of successful authentication
> attempts
> RpcAuthenticationFailures - Total number of authentication failures
> I expect any data request in the Hadoop cluster to make
> a request to the KDC to get a ticket, then
> a request to the NameNode,
> after which one of the counters should increase by 1:
> RpcAuthenticationSuccesses on success, or RpcAuthenticationFailures on
> failure.
> However, in a test cluster with 4 DataNodes and 2 NameNodes (HA), I see
> values for these metrics that I cannot explain.
> I also noticed that RpcAuthenticationSuccesses increases by 1 every
> 30 seconds.
> *TEST 1*
> I made sure that
> 1. Only the HDFS-\{NN,DN,JN,ZKFC} and YARN-\{RM,NM} services are running
> 2. All other components that were present (Hive, Spark HistoryServer) are
> disabled
> 3. There are no YARN jobs running and no user requests to HDFS
> At the start of the test:
> RpcAuthenticationFailures = 0
> RpcAuthenticationSuccesses = 208322
> To generate load, I ran a spark-submit test with
> spark-examples_2.12-3.5.0.jar and the number of executors = 1
> The job completed in 1 minute and 20 seconds
> RpcAuthenticationSuccesses = 208338
> In total, the counter grew by 16 during execution.
> About +2 of that can be attributed to the +1 every 30 seconds mentioned
> above. But what do the other 14 authentications correspond to?
> *TEST 2*
> RpcAuthenticationFailures = 0
> RpcAuthenticationSuccesses = 208388
> hdfs dfs -ls /
> RpcAuthenticationFailures = 0
> RpcAuthenticationSuccesses = 208389
> The counter grew by 1. Why?
> I ran kinit long before the ls request, so I would expect the metric not to
> change, but maybe I am wrong.
> *TEST 3*
> Disabled:
> - All DNs
> - Standby NN
> - All YARN services (RM, NM)
> Still running:
> - Three JNs, ZKFC
> - One active NN
> RpcAuthenticationSuccesses still increases by 1 every 30 seconds.
> Either I misunderstand the meaning of these metrics, or they are being
> counted incorrectly.
> Can you tell me how these metrics are calculated? Is this an error in the
> calculation, or, if I misunderstand how these metrics work, what is the
> correct interpretation?
> Thank you very much

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
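
The counter arithmetic from TEST 1 can be sketched in Python as follows. This is only an illustration of the bookkeeping described in the report, not how Hadoop computes the metrics. It assumes the counters are read from the JSON body of the NameNode `/jmx` servlet, where they appear as attributes of an RPC activity bean; the bean name in the sample payloads (`RpcActivityForPort8020`) and the helper names `read_auth_counters` and `unexplained_delta` are assumptions made for this example. The 30-second background period is taken from the observation in the report.

```python
import json


def read_auth_counters(jmx_payload: str) -> dict:
    """Extract the RpcAuthentication* counters from a /jmx JSON response body.

    Returns the counters of the first bean that carries them.
    """
    beans = json.loads(jmx_payload).get("beans", [])
    for bean in beans:
        if "RpcAuthenticationSuccesses" in bean:
            return {
                "successes": int(bean["RpcAuthenticationSuccesses"]),
                "failures": int(bean["RpcAuthenticationFailures"]),
            }
    raise KeyError("no bean with RpcAuthentication* counters in payload")


def unexplained_delta(before: int, after: int, elapsed_s: int,
                      background_period_s: int = 30) -> int:
    """Counter growth left over after subtracting the periodic background +1.

    The background rate (+1 per 30 s) is the reporter's observation,
    not a documented Hadoop constant.
    """
    total = after - before
    background = elapsed_s // background_period_s
    return total - background


# Sample payloads shaped like a /jmx response; the bean name is an assumption.
SAMPLE_BEFORE = json.dumps({"beans": [{
    "name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",
    "RpcAuthenticationSuccesses": 208322,
    "RpcAuthenticationFailures": 0,
}]})
SAMPLE_AFTER = json.dumps({"beans": [{
    "name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",
    "RpcAuthenticationSuccesses": 208338,
    "RpcAuthenticationFailures": 0,
}]})

if __name__ == "__main__":
    before = read_auth_counters(SAMPLE_BEFORE)
    after = read_auth_counters(SAMPLE_AFTER)
    # TEST 1 ran for 1 minute 20 seconds = 80 seconds.
    print(unexplained_delta(before["successes"], after["successes"], 80))  # 14
```

With the TEST 1 values (208322 before, 208338 after, 80 seconds elapsed) the sketch attributes 2 increments to the periodic background activity and leaves 14 unexplained, matching the figures in the report.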