Anatoly created HDFS-17421:
------------------------------

             Summary: Check the correctness of the calculation RpcAuthentication*
                 Key: HDFS-17421
                 URL: https://issues.apache.org/jira/browse/HDFS-17421
             Project: Hadoop HDFS
          Issue Type: Test
          Components: hdfs, metrics
            Reporter: Anatoly

I wanted to estimate the load on the KDC in a Hadoop cluster after enabling Kerberos. The HDFS metrics include two counters:

{code:java}
RpcAuthenticationSuccesses - Total number of authentication successes
RpcAuthenticationFailures - Total number of authentication failures
{code}

I expected that every request to the cluster would generate a request to the KDC (to get a ticket), and that each request would add +1 to one counter on success or +1 to the other on failure.

However, on a test cluster with 4 DataNodes and 2 NameNodes (HA), these metrics behave quite differently. RpcAuthenticationSuccesses increases steadily, by +1 every 30 seconds.

State of the cluster before the tests:
# Only the HDFS-\{NN,DN,JN,ZKFC} and YARN-\{RM,NM} services are running.
# All other components that had been enabled (Hive, Spark HistoryServer) are disabled.
# There are no YARN jobs running and no user requests to HDFS.

Metric values at the start of the test:

RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208322

*TEST 1*

To check the load, I ran spark-submit with spark-examples_2.12-3.5.0.jar and num-executors 1. The job ran for 1 min 20 sec.

RpcAuthenticationSuccesses = 208338

In total, the counter grew by 16 during the run. Of these, +2 can be attributed to the +1 every 30 seconds. But what does the remaining +14 mean?

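The Test 1 arithmetic can be written out as a minimal Python sketch. The function name `unexplained_increments` and its parameters are hypothetical, chosen only for illustration, and the 30-second background period is simply the rate observed on this test cluster, not a documented Hadoop interval.

```python
def unexplained_increments(before, after, elapsed_s, background_period_s=30.0):
    """Split a counter delta into background ticks and the remainder.

    Assumes the counter grows by +1 every `background_period_s` seconds
    when the cluster is idle (the rate observed in this report).
    """
    total = after - before
    # Whole background ticks that fit into the elapsed time.
    background = int(elapsed_s // background_period_s)
    return total, background, total - background


# Test 1 values from the report: 208322 -> 208338 over 1 min 20 sec.
total, background, extra = unexplained_increments(208322, 208338, 80)
print(total, background, extra)
```

With the Test 1 numbers this yields a total of 16, of which 2 match the idle background rate and 14 remain unexplained.
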
*TEST 2*

Before:

RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208388

Command:

hdfs dfs -ls /

After:

RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208389

*TEST 3*

Turned off:
- all DNs
- the Standby NN
- all YARN services

Still running:
- three JNs
- ZKFC
- one Active NN

RpcAuthenticationSuccesses still increases by +1 every 30 seconds.

Either I misunderstand the meaning of these metrics, or something is being counted incorrectly.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org