[jira] [Updated] (HDFS-17421) Check the correctness of the calculation RpcAuthentication*

Anatoly (Jira) Fri, 08 Mar 2024 01:38:17 -0800


     [ 
https://issues.apache.org/jira/browse/HDFS-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Anatoly updated HDFS-17421:
---------------------------
    Description: 
I wanted to estimate the load on the KDC in a Hadoop cluster after enabling
Kerberos.

There are two relevant counters in the HDFS metrics:

 
{code:java}
RpcAuthenticationSuccesses - Total number of authentication successes
RpcAuthenticationFailures - Total number of authentication failures
{code}
 

I expected that every request to the cluster would generate a request to the
KDC (to get a ticket), and that each such request would add +1 to one metric
on success or +1 to the other on failure.

However, on my test cluster, which has 4 DataNodes and 2 NameNodes (HA), these
metrics behave quite differently.

I noticed that the RpcAuthenticationSuccesses value increases steadily, by +1
every 30 seconds.

For example, this was the state of the cluster before the test:
 # Only the HDFS-\{NN,DN,JN,ZKFC} and YARN-\{RM,NM} services are running
 # All other components that were enabled (Hive, Spark HistoryServer) are
stopped
 # There are no YARN jobs running and no user requests to HDFS

At the start of the test, the metric values were:

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208322
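
For reference, a minimal sketch of how these two counters could be read programmatically. It assumes the daemon's JMX JSON output (the /jmx HTTP endpoint) has the usual \{"beans": [...]} shape and that the counters sit in a bean whose name contains RpcActivityForPort. The port 8020 in the sample payload and the helper name extract_auth_counters are assumptions for illustration, not values taken from this cluster. The sample payload is built by hand from the readings above.

```python
import json


def extract_auth_counters(jmx_payload: str) -> dict:
    """Return {bean_name: (successes, failures)} for every RPC activity bean."""
    counters = {}
    for bean in json.loads(jmx_payload).get("beans", []):
        name = bean.get("name", "")
        # Skip beans that are not per-port RPC activity beans.
        if "RpcActivityForPort" not in name:
            continue
        counters[name] = (
            bean.get("RpcAuthenticationSuccesses"),
            bean.get("RpcAuthenticationFailures"),
        )
    return counters


# Hand-built sample; the bean name (port 8020) is an assumption.
SAMPLE = json.dumps({"beans": [
    {
        "name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",
        "RpcAuthenticationSuccesses": 208322,
        "RpcAuthenticationFailures": 0,
    },
    {"name": "Hadoop:service=NameNode,name=JvmMetrics"},
]})

if __name__ == "__main__":
    for name, (ok, failed) in extract_auth_counters(SAMPLE).items():
        print(name, ok, failed)
```

A daemon can expose more than one RPC port, so the sketch returns one entry per matching bean rather than a single pair of numbers.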

 

*TEST 1*

To check the load, I ran a test spark-submit of spark-examples_2.12-3.5.0.jar
with num-executors 1

The job ran for 1 min 20 sec

RpcAuthenticationSuccesses = 208338

In total, the counter increased by 16 during the run.

+2 can be attributed to the background increase of +1 every 30 seconds.

But what do the remaining +14 represent?
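
The arithmetic above can be written out as a small sketch. It assumes the idle increment is exactly +1 per 30 seconds and counts only the whole intervals that fit into the run; the function name split_increase is hypothetical.

```python
from typing import Tuple


def split_increase(before: int, after: int, run_seconds: int,
                   idle_interval_seconds: int = 30) -> Tuple[int, int, int]:
    """Split a counter increase into background and unexplained parts."""
    total = after - before
    # Whole idle intervals that fit into the run (assumed +1 each).
    background = run_seconds // idle_interval_seconds
    unexplained = total - background
    return total, background, unexplained


if __name__ == "__main__":
    # Values from TEST 1: 208322 -> 208338 over 1 min 20 sec (80 s).
    total, background, unexplained = split_increase(208322, 208338, run_seconds=80)
    print(total, background, unexplained)  # prints "16 2 14"
```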

 

*TEST 2*

Before:

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208388

After running hdfs dfs -ls /:

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208389

 

*TEST 3*

Turned off:
 # All DNs
 # Standby NN
 # All YARN services

Still running:
 # Three JNs, ZKFC
 # One active NN

The RpcAuthenticationSuccesses metric still increases by +1 every 30 seconds.

 

Either I misunderstand what these metrics mean, or something is being counted
incorrectly.

  was:
I wanted to calculate the load on the KDC in the hadoop cluster after enabling 
kerberos.

There are two parameters in hdfs metrics

 
{code:java}
RpcAuthenticationSuccesses - Total number of authentication successes
RpcAuthenticationFailures - Total number of authentication failures
{code}
 

I expect that any request to the cluster will generate a request to KDC -> get 
ticket and the request counter should trigger either +1 to one metric if 
successful or +1 to another metric if failed

However, on the test cluster, where I have 4 data Nodes and 2 NameNodes (HA), I 
see completely different indicators for these metrics.

I noticed that the RpcAuthenticationSuccesses readings are gradually increasing 
= +1 in 30 seconds

For example, before a test in a cluster
 # only HDFS-\{NN,DN,JN,ZKFC} and YARN-\{RM,NM} services work
 # All other components that were enabled – hive, spark HistoryServer are 
disabled
 # There are no YARN jobs running and no user requests to hdfs

At the time of the test, the value of the metrics

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208322

 

*TEST 1*

To check the load, I run the test spark submit spark-examples_2.12-3.5.0.jar 
with num-executors 1

The request was executed for 1 min 20 sec

RpcAuthenticationSuccesses = 208338

In total, 16 points were added during the execution time

+2 can be attributed to those +1 times in 30 seconds. But what does +14 points 
mean?

 

*TEST 2*

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208388

hdfs dfs -ls /

RpcAuthenticationFailures = 0

RpcAuthenticationSuccesses = 208389

 

*TEST 3*

Turned off -

all DN

Standby NN

All YARN services

I still have

Three JN, ZKFC

One NN Active

The +1 counter continues to add +1 to the RpcAuthenticationSuccesses metric 
every 30 seconds

 

Either I don't understand the meaning of these metrics correctly or something 
is not considered right


> Check the correctness of the calculation RpcAuthentication*
> -----------------------------------------------------------
>
>                 Key: HDFS-17421
>                 URL: https://issues.apache.org/jira/browse/HDFS-17421
>             Project: Hadoop HDFS
>          Issue Type: Test
>          Components: hdfs, metrics
>            Reporter: Anatoly
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
